(54) Binaural synthesis, head-related transfer functions, and uses thereof



(57) The invention relates to improved methods and 
apparatus for simulating the transmission of sound from 
sound sources to the ear canals of a listener, said sound 
sources being positioned arbitrarily in three dimensions 
in relation to the listener. In particular, the invention 
relates to new and improved methods for measurement 
of Head-related Transfer Functions, new and improved 
Head-related Transfer Functions, new and improved
methods for processing Head-related Transfer Func- 
tions, and new methods of changing, or of maintaining, 
the directions of the sound sources as perceived by a 
listener. The measurement method has been improved
so that it is now possible to measure and/or construct 
Head-related Transfer Functions for which the time 
domain descriptions are surprisingly short and for which 
the differences from one individual to the other are sur- 
prisingly low. 

The new Head-related Transfer Functions can be 
exploited in any application concerning simulation of 
sound transmission, e.g. auralization of concert halls, 
measurement, simulation, or reproduction of sound, 
such as in binaural synthesis, e.g. for generation, by 
means of two sound sources, such as by headphones 
or by two loudspeakers, the perception of a listener that
he is listening to sound generated by a multichannel
sound system, such as a surround system, a quadraphonic
system, a stereophonic system, etc., in the
design of electronic filters used in, e.g., virtual reality
systems, to simulate sound transmission from a virtual 
sound source to the ear canals of the listener, or, in the 
design of an artificial head that is designed so that its 
Head-related Transfer Functions approximate the Head- 
related Transfer Functions of the invention as closely as 
possible in order to make the best possible
representation of humans by the artificial head, e.g. to make
artificial head recordings of optimum quality.

Description

FIELD OF THE INVENTION 

[0001] The present invention relates to improved 
methods and apparatus for simulating the transmission 
of sound from sound sources to the ear canals of a lis- 
tener, said sound sources being positioned arbitrarily in 
three dimensions in relation to the listener. In particular, 
the invention relates to novel uses of certain Head- 
related Transfer Functions and the production of such 
Head-related Transfer Functions, as well as to methods 
and apparatus using the Head-related Transfer Func- 
tions. 

BACKGROUND OF THE INVENTION 

[0002] Human beings detect and localize sound 
sources in three-dimensional space by means of the 
human binaural sound localization capability. 
[0003] The input to the hearing consists of two signals: 
sound pressures at each of the eardrums. These two 
sound signals are called binaural sound signals. The 
term binaural refers to the fact that a set of two signals
forms the input to the hearing. It is not fully known how
the hearing extracts information about distance and 
direction to a sound source, but it is known that the 
hearing uses a number of cues in this determination. 
Among the cues are coloration, interaural time differ- 
ences, interaural phase differences and interaural level 
differences. Thorough descriptions of cues to directional
hearing are given by J. Blauert: "Räumliches
Hören", Hirzel Verlag, Stuttgart, Germany, 1974 and
"Spatial Hearing", The MIT Press, Cambridge, MA,
1983.

[0004] This means that if the sound pressures at the 
eardrums are created exactly as they would have been 
created by a given spatial sound field, a listener would 
not be able to distinguish this sound experience from 
the one he would get from being exposed to the spatial 
sound field itself.

[0005] One known way of approaching this ideal 
sound reproducing situation is by the artificial head 
recording technique. An artificial head is a model of a 
human head where the geometries of a human being 
which are acoustically relevant especially with respect 
to diffraction around the body, shoulder, head and ears 
are modelled as closely as possible. During a recording, 
e.g. of a concert, two microphones are positioned in the 
ear canals of the artificial head to sense sound pres- 
sures, and the electrical output signals from these 
microphones are recorded.

[0006] When these signals are reproduced, e.g. by 
headphones, the sound pressures in the ear canals of 
the artificial head during the concert are reproduced in 
the ear canals of the listener and the listener will 
achieve the perception that he is listening to the concert
in the concert hall. The signals for the headphones
are also called binaural signals.

[0007] The term binaural signals designates a set of 
two signals, left and right, having been coded using 
transmission characteristics corresponding to the
transmission to the two ears of the human listener, for
instance to be presented in the left and right ear canals,
respectively, of a listener.

[0008] The binaural signals may typically be electrical 
signals, but they may also be, e.g., optical signals,
electromagnetic signals or any other type of signal which
can be transformed, directly or indirectly, into sound
signals in the left and right ears of a human.
[0009] The transmission of a sound wave propagating
from a sound source positioned at a given direction and
distance in relation to the left and right ears of the
listener is described in terms of two transfer functions, one
for the left ear and one for the right ear, that include any
linear distortion, such as coloration, interaural time
differences and interaural spectral differences. These
transfer functions change with direction and distance of
the sound source in relation to the ears of the listener. It
is possible to measure the transfer functions for any
direction and distance and simulate the transfer
functions, e.g. electronically, e.g. by filters. If such filters are
inserted in the signal path between a playback unit such
as a tape recorder and headphones used by a listener,
the listener will achieve the perception that the sounds
generated by the headphones originate from a sound
source positioned at the distance and in the direction as
defined by the transfer functions of the filters, because
of the true reproduction of the sound pressures in the
ears.

[0010] A set of two such transfer functions, one for the
left ear and one for the right ear, is called a Head-related
Transfer Function (HTF). Each transfer function is
defined as the ratio of a sound pressure p generated
by a plane wave at a specific point in or close to the
appertaining ear canal (p_L in the left ear canal and p_R in
the right ear canal) to a reference. The reference
traditionally chosen is the sound pressure p_1
generated by a plane wave at a position right in the middle
of the head, but with the listener absent. In the
frequency domain this HTF is given by:

$$H_L = \frac{P_L}{P_1}, \qquad H_R = \frac{P_R}{P_1} \tag{1}$$

where L designates the left ear and R designates the
right ear. The time domain representation or description
of the HTF, that is the inverse Fourier transform of the
HTF, is often called the Head-related Impulse Response
(HIR). Thus, the time domain description of the HTF is a
set of two impulse responses, one for the left ear and
one for the right ear, each of which is the inverse Fourier
transform of the corresponding transfer function of the
set of two transfer functions of the HTF in the frequency
domain.
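
The relation between the HTF and the HIR stated above can be written out explicitly. This is only a notational sketch using the symbols of equation (1); the lower-case names h_L(t) and h_R(t) for the two impulse responses and the sign convention of the Fourier transform are assumptions, as the text does not fix them.

```latex
h_L(t) = \mathcal{F}^{-1}\{H_L(f)\} = \int_{-\infty}^{\infty} H_L(f)\, e^{j 2\pi f t}\, df,
\qquad
h_R(t) = \mathcal{F}^{-1}\{H_R(f)\} = \int_{-\infty}^{\infty} H_R(f)\, e^{j 2\pi f t}\, df
```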

[0011] The HTF depends upon the angle of incidence
of the plane wave in relation to the listener. It gives a
complete description of the sound transmission to the
ears of the listener, including diffraction around the 
head, reflections from shoulders, reflections in the ear 
canal, etc. 

[0012] The definitions given in equation (1) were given
by J. Blauert: "Räumliches Hören", Hirzel Verlag,
Stuttgart, Germany, 1974.

[0013] A tutorial about binaural techniques is given by
Henrik Møller: "Fundamentals of Binaural Technology",
Applied Acoustics, vol. 36, No. 3/4, pp. 171-218, 1992.
[0014] As mentioned above, binaural signals may be 
generated using the artificial head recording and repro- 
ducing technique; the artificial head could be substi- 
tuted with a test person. 

[0015] Alternatively, binaural signals may be
generated by any means that simulate the transmission of
sound to the ear canals of humans, such as analog fil- 
ters, digital filters, signal processors, computers, etc. 
[0016] U.S. Patent No. 3,920,904 discloses a method
for creating sound pressures at the eardrums of a lis- 
tener by means of headphones, that correspond to 
sound pressures which would be created at the ear- 
drums of the listener in a predetermined acoustical 
environment in response to electrical signals applied to 
a number of loudspeakers, comprising measurement of 
the HTFs corresponding to the positioning of the
loudspeakers in relation to the listener and simulation of the
HTFs with analog electronic filters. 
[0017] It has also been claimed to be possible to 
design the simulating filters using a different approach 
that does not include a measurement of HTFs but relies 
on knowledge of specific cues to directional hearing. 
Such an approach is disclosed in US 4,817,149, where
a front/back cue is generated by a spectral bias,
elevation by a notch filter, and azimuth by a time-shift
between the two channels. 

BRIEF DISCLOSURE OF THE INVENTION 

[0018] The present invention is based on intensive 
research in the field of binaural techniques and provides 
high quality HTFs as well as a number of other improve- 
ments of the binaural techniques and other techniques 
in which HTFs are used. 

[0019] Thus, the invention provides, inter alia, new
and improved methods for measurement of HTFs, new
and improved HTFs, new and improved methods for
processing HTFs, new methods of changing, or of main- 
taining, the directions of the sound sources as per- 
ceived by a listener, and as one of the most important 
utilizations thereof, new methods for binaural synthesis. 
[0020] One object of the present invention is to
provide HTFs for which the differences between the gains,
in the frequency domain, of a HTF from one human to
another are very low, or the differences between the
corresponding time domain descriptions of the HTFs
are very low. The inventors have carried out a major
study of a number of HTFs for a number of different
individuals, for a number of different directions, and for a
number of different measurement points in the external
ear of the individual, i.e. inside the ear canal or in the
vicinity of the entrance to the ear canal. During this
study the inventors have improved the measurement
method so that it is now possible to measure and/or
construct HTFs for which the time domain descriptions
are surprisingly short and for which the differences from
one individual to the other are surprisingly low.
[0021] According to the present invention, a group of
HTFs with advantageous features has been provided
that can be exploited in any application concerning
measurement or reproduction of sound, such as in the
design of electronic filters used in the simulation of
sound transmission from a sound source to the ear
canals of the listener or in the design of an artificial head
that is designed so that its HTFs approximate the HTFs
of the invention as closely as possible in order to make
the best possible representation of humans by the
artificial head, e.g. to make artificial head recordings of
optimum quality.

[0022] Further, the present invention provides meth- 
ods of extracting or constructing, for each direction of a 
sound source in relation to the listener, a function that
represents the human HTFs of a group of humans,
which function can be used as the design target in
different applications, such as the design of an artificial
head or the design of signal processing means. 
[0023] Still further, the present invention provides a
new method of interpolation whereby a virtual distance
and direction of a virtual sound source can be created 
based upon transfer functions corresponding to different 
directions. 

DETAILED DISCLOSURE OF THE INVENTION

[0024] One main aspect of the invention relates to a
method of generating binaural signals by filtering at
least one sound input with at least one set of two filters,
each set of two filters having been designed so that the
two filters simulate the left ear and the right ear parts of
a Head-related Transfer Function (HTF), the method
showing at least one of the features a) - c):

a) the HTF is used generally for a population of
humans for which the binaural signals are intended,
the HTF being determined in such a manner that
the standard deviation of the amplitude, in dB,
between subjects, over at least a major part of the
frequency interval between 1 kHz and 8 kHz is at
the most as shown in Fig. 22 for at least one of the
curves thereof,

b) the duration of the time domain representation of
the transfer function of the filters simulating the HTF
is at the most 2 ms,

c) the value at zero Hertz of the frequency domain
description of the transfer function of the filters
simulating the HTF is in the range from 0.316 to 3.16.

EP 0 912 076 A2

[0025] With respect to feature a): 
[0026] An important aspect of the invention relates to
the utilization of "general" HTFs in binaural synthesis.
The term "general" refers to the very desirable fact that
it is now possible to generate binaural signals using
"general" HTFs that typically differ from the HTFs of a
listener and still provide to the listener a high quality
auditive experience with a high quality of sound
reproduction and a distinct localization of the virtual sound
sources. A "general" HTF or a set of "general" HTFs can
be defined as an HTF for an individual subject of a
population or a set of HTFs for individual subjects of a
population, for a particular angle of sound incidence, the
HTF or HTFs being determined in such a manner that
the standard deviation of the amplitude, in dB, between
subjects, over at least a major part of the frequency
interval between 1 kHz and 8 kHz is at most as shown
in Figs. 22-24 for at least one of the curves of the
figure in question. In the present context, the term "over a
major part of the frequency interval" indicates that in the
logarithmic representation of Figs. 22-24, the standard
deviation will be at the most a value identical to the
value of the curve at the frequency in question over a
major part of the frequency interval, seen in the same
logarithmic representation. In other words, the condition
is complied with when, over at least 51% of the
millimetres of X axis representing the frequency range between
1 kHz and 8 kHz, the standard deviation is less than or
at the most identical to the value represented by the
curve in question. This definition does not indicate that
the standard deviation will be higher than the curve
value in the range of 100 Hz to 1 kHz which is also
shown in the figures - it will always or almost always be
lower than the curve value or at the most identical with
the curve value, but the definition focuses on the part of
the curve, between 1 kHz and 8 kHz, which is much
more critical with respect to "generality". It is, of course,
preferred that the condition is complied with over a
higher proportion of the frequency range, such as at
least 75% or at least 90%, and most preferred that it is
complied with at all frequencies such as is the case in
the results reported herein, but even the least stringent
condition defined above will represent a high degree of
generality.
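
The 51% condition described above can be checked numerically. The following is a minimal Python sketch, not the inventors' procedure. It assumes that the between-subject standard deviation and the limit curve (read from one curve of Fig. 22, 23 or 24, whose values are not reproduced here) are available at the same ascending frequencies, and it treats the segment between two adjacent frequencies as compliant only if both of its end points comply. The function names are hypothetical.

```python
import math


def log_axis_coverage(freqs_hz, sd_db, limit_db, f_lo=1000.0, f_hi=8000.0):
    """Fraction of the logarithmic frequency axis between f_lo and f_hi on
    which the between-subject standard deviation does not exceed the limit.

    freqs_hz : ascending measurement frequencies (Hz)
    sd_db    : measured standard deviation at each frequency (dB)
    limit_db : limit-curve value at each frequency (dB), hypothetical input
    """
    covered = 0.0
    total = 0.0
    for i in range(len(freqs_hz) - 1):
        a = max(freqs_hz[i], f_lo)
        b = min(freqs_hz[i + 1], f_hi)
        if b <= a:
            continue  # segment lies outside the 1 kHz to 8 kHz band
        width = math.log10(b / a)  # length of the segment on a log axis
        total += width
        # Assumption: a segment complies only if both end points comply.
        if sd_db[i] <= limit_db[i] and sd_db[i + 1] <= limit_db[i + 1]:
            covered += width
    return covered / total if total > 0 else 0.0


def is_general(freqs_hz, sd_db, limit_db, min_fraction=0.51):
    """True if the condition holds over at least min_fraction of the log axis."""
    return log_axis_coverage(freqs_hz, sd_db, limit_db) >= min_fraction
```

Raising `min_fraction` to 0.75 or 0.90 gives the more stringent variants mentioned in the text.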

[0027] As appears from Figs. 22-24 and the apper- 
taining discussion, extremely low variations can be 
obtained and have been obtained between subjects, in 
particular for the most important angles of sound inci- 
dence. This means that "general" high quality HTFs can 
now be used for all the various purposes for which HTFs 
are used, thus very significantly increasing the practical
commercial usefulness of HTFs and techniques related 
thereto, such as binaural techniques, in particular bin- 
aural synthesis. 

[0028] As the anatomy of humans shows a substantial
variability from one individual to the other and as the
HTFs of a human among other things are determined by
diffractions and reflections around the head and pinna
and the transmission characteristics through the ear
canals, it is intuitively understood that the HTFs are
different for different individuals. In the prior art, these
differences are considered to be large. Experiments have
been performed where binaural signals have been
generated using HTFs from another person than the
listener, whereby the listener's auditive experience has
been disappointing, among other things due to a
diminished ability of localizing the virtual sound sources from
the binaural signal. Thus, in the art, the variability of
HTFs among humans is considered to be a major
impediment for the use of one set of HTFs for different
listeners. For example, it is reported that: "Substantial
intersubject variability in the HRTF for a single source
position is to be expected, given differences in head
size and pinna shape. This HRTF variability has been
reported before (Shaw 1966) and is prominent in our
data. (..) Fig. 3 shows that variability in HRTF from
subject to subject grows with frequency until it reaches a
peak of almost 8 dB between 7 and 10 kHz". F. L.
Wightman and D. Kistler, "Headphone Simulation of
Free-Field Listening, I: Stimulus Synthesis, II:
Psychoacoustical Validation," J. Acoust. Soc. Am. Vol. 85(2),
pp. 858-878, 1989. The data reported are 1/3 octave noise
band values.

[0029] However, it is a major achievement of the
present invention that it has now been found that it is
possible to provide or determine an HTF (A) for a
particular angle of sound incidence which is so close to
corresponding individual HTFs that the function HTF (A) will
satisfy even critical quality demands by almost all
potential users for which the function is intended, in contrast
to the widespread belief in the art that HTF would have
to be adapted to the individual user to achieve a
satisfactory quality in the practical uses of the HTF. In
practice, this will mean that the use according to the
invention of the HTF (A) will result in a higher quality in
almost all situations of use, and thus a general
improvement. This is illustrated in more detail later in the
description with reference to Fig. 8.
[0030] The ability of the HTF (A) to be close to
corresponding individual HTFs, or, expressed in another
manner, to be a member of a group of HTFs determined
with a low standard deviation, is quantitatively described
by the conditions mentioned above with respect to Figs.
22-24. The HTFs are considered to have the quality of
generality when the standard deviation is at the most as
shown in Fig. 22 for at least one of the appropriate
curves of Fig. 22.

[0031] The properties of the HTF complying with the
criteria of Fig. 22 for a population, such as, e.g., U.S.
astronauts or Scandinavian teenagers, or, quite
generally, a population for which the product of the binaural
synthesis is intended or primarily intended, can, thus,
also be expressed by the square root of the mean of the
squared differences between

the amplitude, given in dB for third octave noise, of
the HTF

and

the amplitudes, given in dB for third octave noise, for
a group of randomly selected individual HTFs of the
population, being at the most 2.2 times the standard
deviation as shown in Fig. 8 for the majority of
the third octave frequencies shown, preferably at
the most 1.7 times the standard deviation as shown
in Fig. 8, more preferably at the most 1.4 times the
standard deviation as shown in Fig. 8, and most
preferably at the most 1.2 or even 1.1 times the
standard deviation as shown in Fig. 8.
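
The root-mean-square criterion above lends itself to a short numerical check. The sketch below is illustrative only: the function names are hypothetical, the reference standard deviations of Fig. 8 are not reproduced in this text and must be supplied by the caller, and "the majority" is interpreted here as more than half of the third octave bands.

```python
import math


def rms_difference_db(candidate_db, individuals_db):
    """Per-band root-mean-square difference (dB) between a candidate HTF and
    the individual HTFs of a group.

    candidate_db   : amplitudes of the candidate HTF, one per third octave band
    individuals_db : list of amplitude lists, one list per individual
    """
    result = []
    for k in range(len(candidate_db)):
        squares = [(ind[k] - candidate_db[k]) ** 2 for ind in individuals_db]
        result.append(math.sqrt(sum(squares) / len(squares)))
    return result


def meets_rms_criterion(candidate_db, individuals_db, reference_sd_db, factor=2.2):
    """True if the RMS difference is at most factor * reference_sd_db in more
    than half of the bands (assumed reading of "the majority")."""
    rms = rms_difference_db(candidate_db, individuals_db)
    ok = sum(1 for r, s in zip(rms, reference_sd_db) if r <= factor * s)
    return ok > len(rms) / 2
```

Calling `meets_rms_criterion` with `factor` set to 1.7, 1.4, 1.2 or 1.1 corresponds to the progressively preferred limits listed in the text.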

[0032] In the assessment of whether an HTF fulfils
these "generality" qualities, the individual HTFs (of a
representative number of individuals of the population)
to be compared with the HTF in question could be
determined for a particular angle of sound incidence, a
particular distance, a particular reference point for the
HTFs, and a particular posture, the determination being
performed so that the repeatability of the measurement,
expressed in terms of standard deviation of the
amplitude, in dB, between repeated measurements, is at the
most V2 times the standard deviation shown in Fig. 8. 
The assessment will, of course, be most appropriate
and valuable if it uses parameters with respect
to sound incidence, reference point and posture which
correspond to the ones used in the original
determination of the HTF or the ones which the HTF is adapted to
simulate. While the description which follows discloses
a number of specific methods for measuring and/or
constructing HTFs so that they will comply with the
generality criterion, the above assessment principle can be said
to be a general way of judging the suitability of a
candidate HTF for a particular use, or of judging whether an
HTF implemented for a particular use is within the
scope of the present invention.

[0033] While partial or full conformity, as discussed
above, with the criteria illustrated in Fig. 22 can be said
to be a basic requirement for the "generality" of an HTF,
it is preferred that the HTFs fulfil, at least with respect to
one of the curves, the more stringent criteria illustrated
in Fig. 23 or even, at least with respect to one of the
curves, the still more stringent criteria illustrated in Fig.
24. It should be noted that the reason why the curves
relating to the 1/3 octave measurement are positioned
lower than the pure tone curves is that the 1/3 octave
curves are frequency averages. It will be understood
that analogously to the criteria of Fig. 22, it is preferred,
on each level of increasing stringency as defined by Fig.
23 and Fig. 24, that the HTFs fulfil the criteria for at least
one of the appropriate curves of the figure in question.
[0034] It will be understood that while the above
conditions or criteria define "general" HTFs for a broad
population, there are certain evident criteria for what
constitutes a population in the sense of the present dis- 
closure, these criteria being associated with the anat- 
omy of the ears and other anatomic characteristics of 
the population. Thus, it is presumed that a set of HTFs 
determined for a group of adults will not be optimal 
"general" HTFs for a population of small children. How- 
ever, this does not introduce any uncertainty in the 
present context, as it has been found, as discussed 
above, that the generality criteria for a particular popula- 
tion will be fulfilled when the criteria of Fig. 22, prefera- 
bly Fig. 23 and more preferably Fig. 24 are fulfilled for 
the population in question, that is, when an assessment
as discussed above has been made on a representative 
(with respect to number and variation) subpopulation of 
the population in question, e.g. 25 persons of the popu- 
lation, or preferably more persons. 
[0035] With respect to feature b):
[0036] According to the invention, it has surprisingly 
been found that it is possible, without any significant 
loss in quality, to reduce the duration of the time domain 
representation of high quality HTFs, i.e. high quality 
HIRs, used in binaural synthesis to 2 ms or even lower. 
This will very considerably reduce the demands on
computer power when simulating the HTFs. When
generating binaural signals, a sound input signal is typically
convolved with the HIR. The terms "the duration of the
time domain representation of a HTF" or equivalently 
"the duration of the HIR" refer to the length in time of 
that part of the HIR that is used for convolution of the 
sound input signal. Reduction of the duration of the time 
domain representation of a HTF or equivalently reduc- 
tion of the duration of the HIR refers to the fact that a 
shorter part of the HIR is used for the convolution of the 
sound input signal. As short HTFs (or HIRs) have been 
provided according to the present invention, high quality 
HTFs implemented by means of digital filters can now 
be handled by moderate computing resources. The time 
domain representations of HTFs reported in the prior art
range from 2.9 ms and up. When evaluating the duration
of Head-related Impulse Responses it is important to
study their frequency response. Examples are reported
where an apparently short pulse cannot be truncated to
less than a few milliseconds as the truncation changes 
its frequency response to an unacceptable extent 
because the impulse contains essential information 
over a longer time duration. It has been found that this is 
not the case for the high quality impulses determined as 
disclosed herein or otherwise complying with the criteria 
underlying the present invention, as illustrated below 
with reference to Fig. 9 and Fig. 10. 
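
To make the computational saving concrete, the following minimal Python sketch truncates an impulse response to a given duration and applies it by direct convolution. It is an illustration only: the function names are hypothetical, the 48 kHz sampling rate used in the note below is an assumption not taken from the text, and the truncation simply keeps the first samples, which presumes that the HIR begins at or near its onset.

```python
def truncate_hir(hir, sample_rate_hz, duration_ms=2.0):
    """Keep only the first duration_ms milliseconds of an impulse response."""
    n_taps = int(round(sample_rate_hz * duration_ms / 1000.0))
    return list(hir[:n_taps])


def convolve(signal, hir):
    """Direct-form FIR convolution of a sound input signal with an HIR.

    The output has len(signal) + len(hir) - 1 samples, and each input sample
    costs len(hir) multiply-accumulate operations.
    """
    out = [0.0] * (len(signal) + len(hir) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(hir):
            out[n + k] += x * h
    return out
```

At an assumed sampling rate of 48 kHz, a 2 ms HIR has 96 taps, so direct convolution needs 96 multiply-accumulate operations per output sample for each ear.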
[0037] The quality of the HTFs obtained by the
inventors has been proven by experiments wherein
truncated versions of the HTFs obtained have been used for
binaural synthesis. A panel of listeners have compared 
sound reproductions based on the truncated and the 
non-truncated versions of the same HTF and it was 
found that the HTFs obtained by the inventors could be
truncated to the durations mentioned above without loss
of quality of the audible impression perceived by the lis- 
tener, the listening test being a three-alternative-forced- 
choice test. It will be understood that in this aspect of 
the invention, this kind of test is a general test which can 
be used to assess the truncatability of any HTF. 
[0038] The literature contains disclosures of certain 
short impulses which are not proper HTFs according to 
the general definition. For example, transfer functions
are reported where the pressures p in the ear canals
are not divided by p_1 and therefore these
measurements are not measurements of the HTFs but
measurements of the combined transfer functions of the
loudspeaker and the HTF.

[0039] While the use of HTFs of duration of 2 ms is 
believed to be unique to the present invention, it has 
been found possible to use even shorter parts of HTFs,
such as at the most 1.5 ms or shorter, e.g. at the most
1.2 ms or 1 ms or even down to at the most 0.9 ms or
0.75 ms or at the most 0.5 ms.

[0040] One criterion which should normally be 
observed in connection with the use of such short HTFs 
is that they should comply with certain requirements 
with respect to their DC value, such as described below 
in connection with feature c). While it is possible to use 
HTFs as short as described above without any DC
adjustment, a normal precaution preferred by the inventors as
a routine measure is to adjust the DC value of the short 
HTFs in accordance with the teaching given in connec- 
tion with feature c). 
[0041] With respect to feature c):
[0042] According to this feature, the value at zero Hz
of the frequency domain representation of the HTF is in
the range from 0.316 to 3.16, preferably in the range
from 0.5 to 2, such as in the range from 0.7 to 1.4, more
preferably in the range from 0.8 to 1.2, such as in the
range from 0.9 to 1.1, and most preferably in the range
from 0.95 to 1.05, and optimally set to 1.0.
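
As a reading aid, the outer limits can be expressed in decibels. This assumes that the DC value is a linear amplitude ratio, which the text does not state explicitly; under that assumption the widest range corresponds to approximately plus or minus 10 dB around the theoretical value of 1 (0 dB), and the next range to approximately plus or minus 6 dB.

```latex
\begin{aligned}
20\log_{10}(0.316) &\approx -10.0\ \text{dB}, & 20\log_{10}(3.16) &\approx +10.0\ \text{dB},\\
20\log_{10}(0.5) &\approx -6.0\ \text{dB}, & 20\log_{10}(2) &\approx +6.0\ \text{dB}.
\end{aligned}
```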
[0043] Until the present invention, the value at zero Hz 
of the frequency domain representation of the HTF (the 
DC value of the HTF) seems to have attracted little or no 
attention in the art. However, the research and develop- 
ment of the present inventors has revealed that the DC 
value has a significant influence on the frequency 
domain representation of the HTF thereby influencing 
the sound quality, such as coloration, when the HTF is 
used in sound reproduction. 

[0044] When HTFs have been measured, the DC 
value of the HTF is not measured as sound transducers 
are not able to generate a static sound pressure. There- 
fore, the DC value measured is related to secondary 
characteristics of the measurement set-up that often are
not accurately controlled, such as DC offsets in the
measurement amplifiers, and the DC values measured
are not related to the HTFs under measurement. 
[0045] The theoretical DC value of the HTFs is 1 as
static sound pressure is not altered by the presence of
the listener. Further, no diffraction occurs around the
head at low frequencies and therefore the sound
pressures at different points tend to be identical at lower
frequencies. Measuring a value different from 1
corresponds to adding a constant in the time domain
representation of the HTF or to adding a sinc function to
the frequency domain representation of the HTF, which
changes the appearance of the frequency response
significantly, especially at lower frequencies, and this
changes the sound quality when the HTF is used for
binaural synthesis. This is further illustrated below with
reference to Fig. 11 and Fig. 12.
[0046] Thus, according to the present invention the
DC value of the measured HTF is adjusted to be in the
range from 0.316 to 3.16, preferably in the range from
0.5 to 2, such as in the range from 0.7 to 1.4, more
preferably in the range from 0.8 to 1.2, such as in the range
from 0.9 to 1.1, and most preferably in the range from
0.95 to 1.05, ideally 1, either directly in the frequency
domain representation of the HTF or by adding a
constant to the time domain representation of the HTF.
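
The time domain variant of this adjustment can be sketched in a few lines of Python. For a sampled impulse response the value at zero Hz equals the sum of its samples, so adding the same constant to every sample shifts the DC value by that constant times the number of samples. The function names are hypothetical and the sketch assumes a non-empty, real-valued HIR.

```python
def dc_value(hir):
    """DC value (gain at 0 Hz) of a sampled impulse response: the sum of its taps."""
    return sum(hir)


def adjust_dc(hir, target=1.0):
    """Add the same constant to every tap so that the taps sum to target."""
    offset = (target - dc_value(hir)) / len(hir)
    return [h + offset for h in hir]
```

For example, a four-tap response whose taps sum to 0.7 receives an offset of 0.075 per tap, after which its DC value is 1.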

[0047] Further, the method of adjusting the DC value
to be within an adequate range of the correct value of
the HTF has the advantage that the frequency values of
the HTF between the value of the lowest frequency
measured and zero Hz are interpolated between these
two values, whereas extrapolation has to be used when
adjustment of the DC value is not used, and
extrapolation leads to less accurate results and even in some
cases to very poor results.
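
A minimal sketch of such an interpolation follows. The text does not specify the interpolation rule, so linear interpolation of the complex spectrum values on a uniform frequency grid is an assumption made here for illustration, as are the function and parameter names.

```python
def fill_low_bins(spectrum, first_valid_bin, dc_value=1.0):
    """Return a copy of spectrum (complex values on a uniform frequency grid,
    bin 0 = 0 Hz) in which the bins below first_valid_bin are linearly
    interpolated between dc_value at 0 Hz and the measured value at
    first_valid_bin (the lowest frequency actually measured)."""
    filled = list(spectrum)
    anchor = spectrum[first_valid_bin]
    for k in range(first_valid_bin):
        w = k / first_valid_bin  # 0 at DC, approaching 1 at the first measured bin
        filled[k] = (1.0 - w) * dc_value + w * anchor
    return filled
```

Because both end points are known, every filled value lies between them, which is the advantage over extrapolation described in the text.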
[0048] In many applications of the method of the
invention, it is desired to simulate more than one sound
source, and thus, for many practical embodiments of the
method, the at least one sound input is filtered with at
least two sets of two filters, each set of two filters having
been designed so that the two filters simulate the left
ear and the right ear parts of a Head-related Transfer
Function (HTF), or with at least three sets of two filters,
each set of two filters having been designed so that the
two filters simulate the left ear and the right ear parts of
a Head-related Transfer Function (HTF), and so on for
at least four sets of two filters, at least five sets, etc.
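
Filtering with several sets of two filters can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: each virtual sound source is represented by a hypothetical tuple of an input signal and the left and right impulse responses of its HTF, the filters are applied by direct convolution, and the filtered signals are summed per ear. The same input signal may appear in more than one tuple.

```python
def fir(signal, taps):
    """Direct-form FIR filtering (full convolution)."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += x * h
    return out


def binaural_mix(sources):
    """sources: list of (signal, hir_left, hir_right) tuples, one per virtual
    sound source. Returns (left, right), each the sum of all filtered sources."""
    length = max(len(s) + max(len(hl), len(hr)) - 1 for s, hl, hr in sources)
    left = [0.0] * length
    right = [0.0] * length
    for signal, hir_left, hir_right in sources:
        for i, v in enumerate(fir(signal, hir_left)):
            left[i] += v
        for i, v in enumerate(fir(signal, hir_right)):
            right[i] += v
    return left, right
```

Summation per ear relies on the linearity of the sound transmission, which the filters are designed to simulate.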
[0049] In the following, a number of measures which
have been found by the inventors to be valuable in the
measurement and/or construction of HTFs are
discussed. As appears from the discussion, these
measures, and combinations thereof, have resulted in HTFs of
qualities which must be believed to be hitherto
unattained, and several such HTFs for a number of angles of
sound incidence are disclosed specifically herein, in
particular in the drawings. These HTFs and
combinations thereof are believed to be novel per se and, like the
novel measures for the measurement and/or
construction of HTFs, constitute aspects of the present
invention. As will be understood, these HTFs show the
features identified under a) - c) above and, thus, their
use constitutes preferred embodiments of the binaural
synthesis aspect of the invention. However, it will also
be understood that the invention is not limited to the use
of these HTFs or to HTFs measured or constructed
using the special techniques disclosed herein, but
encompasses the novel use of any HTF or combination
of HTFs, irrespective of how it was
determined/provided, as long as the HTF or the combination shows the
characterizing features defined herein.
[0050] As described in the above mentioned tutorial
and by Hammershøi and Møller: "Sound Transmission
to and within the Human Ear Canal", submitted for the
Journal of the Acoustical Society of America, December
1994, the inventors' research and development have
revealed that the transmission of sound pressures from
one point to another in the ear canal is independent of
the angle of sound incidence. The consequence of this
is that the physical location of a point, where full
directional information is present, may be chosen anywhere
from the eardrum to the entrance of the ear canal.
Possibly, even points a few millimetres outside the ear canal
and in line with it may be used. It has also been shown
that full directional information is present at the entrance
to a blocked ear canal. Further, it has been shown by
the inventors that a major part of the individual
differences of sound transmission to the eardrums of
different humans is caused by individual differences of the
sound transmission along the ear canal. Therefore, the
inventors presently prefer to measure the HTFs at the
entrance to the blocked ear canal, as full directional
information has been shown to be present at this point
and the individual differences between the HTFs of
different humans have been estimated to be minimal at
this point.

[0051] According to research of the inventors, this is
related to the fact that measurements at the entrance of
the blocked ear canal are not related to the remaining
sound transmission to the eardrum, since statistical
analysis reveals that HTFs measured at the entrance of
the blocked ear canal are uncorrelated with the remaining
part of the sound transmission. According to the
inventors, this quality is evidently not maintained in
measurements at other points in the ear, e.g. at the entrance of
the open ear canal.

[0052] Measurement at the entrance to the blocked
ear canal has previously been demonstrated to reduce
the standard deviation between measurements, but the
above surprising recognition that it is possible, using
inter alia this measure, to arrive at "general" HTFs,
realistically useful for a population, as contrasted to the
individual approach previously believed to be necessary in
high quality binaural synthesis, is novel and important.
[0053] The measurement of sound pressures at the
entrance to the blocked ear canal has the further
advantage that it is relatively easy to mount a microphone at
this point. The inventors prefer to integrate the ear plug
and the microphone.

[0054] Thus, according to a preferred embodiment of
the invention, the reference point of the HTF or the
HTFs is at the entrance, or close to the entrance, to the
blocked ear canal.



[0055] The reference point (where the measuring
microphone is arranged) may be outside the ear canal,
or it may be inside the ear canal. If it is inside the ear
canal, the blocking of the ear canal is positioned deeper
in the ear canal. The reference point is normally at most
0.8 cm from the entrance to the blocked ear canal. More
preferably, it is at most 0.6 cm from the entrance to the
blocked ear canal, most preferably at most 0.3 cm from
the entrance to the blocked ear canal, and ideally just at
the entrance. Typically, the blocking of the ear canal is
performed by means of a conventional ear plug,
preferably of a compressible foam plastic material which, in
the ear canal, will expand to completely fill out the
cross-section of the ear canal.

[0056] As mentioned above, the present invention pro- 
vides a number of quality improvements of the princi- 
ples according to which HTFs are measured, and the 
conditions under which they are measured. These 
improvements are reflected and manifested in the quality
and utility of the new HTFs according to the invention.
Thus, an aspect of the invention relates to the use
of an HTF that has been established using at least one
of the following measures a)-i):

a) the sound pressure p2 from a spatially arranged
sound source has been measured at the entrance,
or close to the entrance, to the blocked ear canal of
a person or of an artificial head,

b) the sound pressure p1 from the sound source
has been measured at a position between the ears
of the test person or of the artificial head, with the
test person or the artificial head absent,

c) the frequency domain description of the HTF has
been calculated by dividing the frequency domain
description of p2 by the frequency domain description
of p1, optionally followed by low-pass filtering,

d) the time domain description of the HTF has been
obtained by inverse Fourier transformation of the
frequency domain description,

e) for a particular direction in relation to the test
person or the artificial head, the left and right ear parts
of the HTF have been measured simultaneously,

f) the test person has been standing during the 
measurement of the HTF, 

g) the test person has been monitored by visual
means, such as video, to ensure that the position of
the head of the test person was not changed during
the measurement of the HTF, and/or any measurement
of an HTF during which the position of the
head differed from the correct position has been
discarded,

h) the test person himself monitored the position of
his head, e.g. by means of mirrors or a video monitor,
in order to keep his head in the correct position
during measurement of the HTF,

i) the measurements were carried out in an anechoic
chamber, the measurement time for one HTF
being at the most 5 seconds, preferably at the most
3 seconds, more preferably at the most 2 seconds,
such as about 1.5 seconds.
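
Measures c) and d) can be sketched as follows, taking p1 as the reference pressure measured with the test object absent and p2 as the pressure at the entrance to the blocked ear canal. This is a minimal sketch under stated assumptions: the function name `compute_htf`, the brick-wall form of the optional low-pass filter, and the synthetic test signals are all hypothetical and are not taken from the specification.

```python
import numpy as np

def compute_htf(p1, p2, fs, cutoff_hz=None):
    """Return (H, h): the frequency domain HTF p2/p1 and its time domain description."""
    P1 = np.fft.rfft(p1)
    P2 = np.fft.rfft(p2)
    H = P2 / P1                                # measure c): spectral division
    if cutoff_hz is not None:
        # Assumption: a crude brick-wall low-pass; a practical set-up would
        # more likely use a smooth filter.
        freqs = np.fft.rfftfreq(len(p1), d=1.0 / fs)
        H = np.where(freqs <= cutoff_hz, H, 0.0)
    h = np.fft.irfft(H, n=len(p1))             # measure d): inverse Fourier transform
    return H, h

# Synthetic check: p2 is p1 passed through a known short impulse response.
N = 16
p1 = np.zeros(N)
p1[0], p1[1] = 1.0, 0.2
h_true = np.zeros(N)
h_true[2], h_true[3] = 0.8, -0.3
p2 = np.convolve(p1, h_true)[:N]

H_est, h_est = compute_htf(p1, p2, fs=48000)
```

Because both pressures pass through the same loudspeaker and measurement chain, dividing their spectra cancels that chain in this idealised example and recovers `h_true`.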

[0057] In several disclosures of the prior art, the HTFs
have been measured in an anechoic chamber by
establishing a sound field using a loudspeaker as the sound
source followed by the measurement, frequency by
frequency, of p2 and then of p1, or vice versa. The HTF is
then calculated by dividing p2 by p1. However, this
method only provides the gain of the HTF and the phase
remains unknown.

[0058] Some prior art literature discloses
measurements of the HTFs that do not include measurement of
p1. This means that the HTFs disclosed are not real
HTFs but transfer functions that combine the transfer
function of the loudspeaker used with the transmission
of sound pressures from the loudspeaker to the point
where the sound pressures have been measured. If the
combined transfer function is used to reproduce
binaural sound signals, the listener will perceive the sound
reproduced as being played by this loudspeaker.
[0059] Thus, it is an important aspect of the invention
that the sound pressure p1 created by a sound source
has been measured at a position between the ears of
the test person, with the test person absent, and the
frequency and time domain representations of the HTF
have been established as described above.
[0060] The optional low-pass filtering is performed to
avoid the effect of the relatively low measurement
values obtained at frequencies close to half the sampling
frequency, these values being mainly defined by the
frequency characteristics of the loudspeakers and
microphones and the anti-aliasing filters used in the
measurement set-up. The
division of the two sound pressures in this frequency
range has been seen to create significant peaks and
valleys in the frequency domain representation of the
HTF if not followed by the low-pass filtering.
[0061] The simultaneous measurement of the two 
HTFs (for the left and the right ear) ensures that the 
position and orientation of the head of the test person or 
the artificial head is not changed between measure- 
ment of the HTF and/or that the time references of the 
measurements of the HTF are identical. 
[0062] The time difference between the
arrival of sound pressures from a specific sound source
at the left ear and the right ear of the listener is one of
the most important parameters in sound localization. It
is therefore very important to determine this parameter, the
interaural time difference, accurately. If the measurement of
the HTF is not carried out simultaneously for the two
ears, the ears of the test person have to be kept in the
same position within millimetres during the two
measurements. For example, a movement of 1 cm of the head
of the test person corresponds to a time difference of 30
µs, and an uncertainty of this magnitude in the determination
of the interaural time difference will typically
influence the quality of the HTFs significantly.
Therefore, the inventors have chosen the more practical and
accurate solution of measuring the HTF simultaneously
for the two ears.
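
The order of magnitude of the stated time difference can be checked with a short calculation. The speed of sound of 343 m/s used here is an assumption (air at roughly room temperature); the specification does not state the value it used.

```python
# Assumed speed of sound in air; not given in the specification.
SPEED_OF_SOUND_M_PER_S = 343.0

def path_delay_us(displacement_m, c=SPEED_OF_SOUND_M_PER_S):
    """Propagation time, in microseconds, for a path-length change of displacement_m."""
    return displacement_m / c * 1e6

delay_us = path_delay_us(0.01)   # 1 cm head movement along the direction of propagation
print(f"{delay_us:.1f} microseconds")
```

Under this assumption the result is a little over 29 microseconds, consistent with the rounded figure of 30 µs in the text.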

[0063] When performing measurements of HTFs, it is
most commonly prescribed in the art to use a seated
test person during measurements, as a seated test person
is well supported and thereby in a good position to
keep the head in a fixed position during measurements.
The disadvantage of this method is that reflections from
the knees prolong the impulse responses. As the
present inventors have found no indications contradicting
the general understanding that there is no difference
in sound localization ability of a sitting and a standing
person, they have preferred to use a standing test person
during their measurements to obtain as short
impulse responses as possible. However, this solution
requires good support of the position of the test person,
while simultaneously avoiding reflections from the
supporting means. As illustrated in Fig. 6, the test person is
supported at the lumbar region where the support does
not cause any sound reflections. Further, the duration of
a measurement is kept very short, which eases the task
of the test person of not moving the head during
measurement. The duration of a measurement is 1.5 seconds,
which represents an optimum choice for signal to noise
ratio and measurement duration.
[0064] Further, the test person has preferably been
monitored by visual means, such as video, to ensure
that the position of the head of the test person has not
been changed during the measurement of the HTF.
[0065] If a movement of the head of the test person is
detected during a measurement of the HTF, it has been
preferred to discard such a measurement.

[0066] To assist the test person in keeping his head in
a fixed position during the measurement, the test set-up
included a video monitor so that the test person himself
could monitor the position of the head in order to keep the
head in a correct position during measurement.

[0067] Having measured the HTFs for a group of test
persons and for a set of directions to a set of sound
sources in relation to the test person, it is now possible
to construct an HTF (A) that for a given direction
represents the measured HTFs corresponding to this
direction.

[0068] One way of doing this is to select one of the 
HTFs measured as the HTF (A) after adjustment of the 
DC value to the range previously described. 
[0069] The selected HTF (A) should be the one that
for most persons provides a sound experience of a high
quality when the HTF (A) is used to reproduce sound,
e.g. by means of play back of sound recordings through
filters with transfer functions that correspond to the
selected HTFs (A), as described in more detail below.
[0070] One aspect of the invention relates to an HTF
(A) obtained from HTFs (B) obtained according to any of
the methods described above for at least two test objects, a
test object being a person or an artificial head, by
selecting an HTF which, when used in binaural
synthesis, gives a sound impression which, when presented to
a test panel, is found to give a high degree of conformity
with real life listening to a sound source in the direction
in question. Such a test is described in greater detail in
the following.

[0071] Another related aspect of the invention is an
HTF (A) obtained from HTFs (B) obtained according to
any of the methods described above for at least two test
objects, a test object being a person or an artificial
head, by selecting an HTF which, when described
objectively, e.g. in the frequency or the time domain,
shows a high degree of similarity to individual HTFs of a
population. This aspect, too, is described in greater
detail below. For a specific direction, one criterion could
be to select as the HTF (A) the HTF for which the sum
of differences between the appertaining HTF and the
other HTFs measured is minimal. The difference can
be defined as the absolute value of the difference
between two measured values of the corresponding
HTFs, or the squared value of the difference, or any other
function of the difference between two measured values
of the corresponding HTFs. For a specific direction, this
means that for each HTF measured, the difference
between this HTF and each of the other HTFs of the set
of HTFs measured is calculated for each time sample
(or for each time sample of a selected subset of time
samples) of the time domain representation of the HTFs,
or for each frequency sample (or for each frequency
sample of a selected subset of frequency samples) of
the frequency domain representation of the HTFs,
and all the calculated differences are then
added to form a resulting sum. When performing the
summation, the calculated values can be multiplied by
weight factors. Then the HTF with the least resulting sum
is selected as the HTF (A).
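
The selection criterion described above can be sketched as follows. This is an illustrative sketch, not the inventors' code: the function name `select_representative`, the use of real-valued samples (for example time samples or magnitudes in dB), and the toy data are assumptions.

```python
import numpy as np

def select_representative(htfs, weights=None, squared=False):
    """Return the index of the HTF with the least summed difference to all others.

    htfs: array of shape (n_test_objects, n_samples), real-valued samples.
    weights: optional per-sample weight factors applied before summation.
    """
    htfs = np.asarray(htfs, dtype=float)
    diff = htfs[:, None, :] - htfs[None, :, :]     # all pairwise sample differences
    dist = diff ** 2 if squared else np.abs(diff)  # squared or absolute difference
    if weights is not None:
        dist = dist * np.asarray(weights, dtype=float)
    totals = dist.sum(axis=(1, 2))                 # resulting sum for each HTF
    return int(np.argmin(totals)), totals

# Toy data: three "measured" HTFs with three samples each.
measured = [[1.0, 2.0, 3.0],
            [1.1, 2.1, 2.9],
            [3.0, 0.0, 5.0]]
best_index, totals = select_representative(measured)
```

In this toy data the first HTF has the smallest resulting sum and would be selected as the HTF (A); a subset of samples can be emphasised or excluded through `weights`.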

[0072] The representing HTF (A) can also be calculated
on the basis of the measured HTFs, for at least
two test objects, a test object being a person or an
artificial head, by averaging, in the frequency domain, the
amplitude of the HTFs (B), the amplitude averaging
being performed, e.g., on pressure, power or logarithmic
basis, followed by minimum phase or zero phase
construction to obtain an HTF, the averaging being
optionally followed by addition of a linear phase component
giving an interaural time difference, the linear
phase component or the interaural time difference suitably
being obtained in a separate averaging of the linear
phase components or the interaural time differences of
the original HTFs (B). This method of constructing an
HTF (A) is possible only because it has been found
feasible, according to the present invention, to obtain
measured HTFs which are very similar to each other.
As a result of the fact that the deviations between HTFs
according to the present invention are very low, it has
become possible and relatively easy to recognize and
utilize specific features of the HTFs, such as significant
peaks and notches of the HIRs, amplitude peaks of the
HTF, etc. Thus, an HTF (A) may be obtained from HTFs
(B) for at least two test objects, a test object being a
person or an artificial head, by averaging characteristic
parameters of the HTFs (B), the characteristic parameters
for instance being the frequency and the amplitude
of characteristic points, e.g. peaks or notches, or the
frequency of 3 dB points of peaks or notches, when the
HTFs (B) are described in the frequency domain, or the
time and the amplitude of characteristic points, e.g. a
characteristic positive peak or a characteristic negative
peak, or the time of a characteristic zero crossing, when
the HTFs are described in the time domain, or the
coordinates of, or the characteristic frequency and the
Q-factor of, poles and zeroes, when the HTFs are
described in the complex s- or z-domain.
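
The three amplitude averaging bases named above (pressure, power, logarithmic) can be sketched as follows. The function name `average_amplitude` and the toy magnitudes are hypothetical; the sketch covers only the averaging step and a zero phase construction, not the minimum phase construction or the addition of a linear phase component.

```python
import numpy as np

def average_amplitude(magnitudes, basis="pressure"):
    """Average HTF magnitudes across test objects (rows) for each frequency (columns)."""
    m = np.asarray(magnitudes, dtype=float)
    if basis == "pressure":
        return m.mean(axis=0)                      # arithmetic mean of |H|
    if basis == "power":
        return np.sqrt((m ** 2).mean(axis=0))      # mean of |H|^2, returned as amplitude
    if basis == "log":
        return np.exp(np.log(m).mean(axis=0))      # mean of log|H| (geometric mean)
    raise ValueError("basis must be 'pressure', 'power' or 'log'")

# Toy magnitudes for two test objects at two frequencies.
mags = [[1.0, 2.0],
        [4.0, 8.0]]
avg_pressure = average_amplitude(mags, "pressure")
avg_power = average_amplitude(mags, "power")
avg_log = average_amplitude(mags, "log")

# Zero phase construction: a real, non-negative spectrum gives a symmetric impulse response.
zero_phase_ir = np.fft.irfft(avg_log)
```

The three bases generally give different results for the same data, so the choice of basis is part of the definition of the resulting HTF (A).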
[0073] A set of HTFs that represent the HTFs (B)
measured for a set of directions to sound sources can
be constructed according to the above described
methods, the methods chosen for the
construction of HTFs (A) for different specific directions
being identical or different as
considered advantageous for the actual application.
[0074] Further, a set of HTFs (A) could be constructed 
as described above but where one subset of the HTFs 
(A) could be constructed from HTFs (B) measured on a 
group of test persons while other subsets of HTFs (A) 
could be constructed from HTFs (B) measured on differ- 
ent groups of test persons. 

[0075] An important aspect of the invention is an HTF 
(A) obtained from HTFs (B) for at least two test objects, 
a test object being a person or an artificial head, by 
averaging in the time domain or in the frequency domain 

a) the time-aligned HTFs (B), the time alignment 
being performed, e.g., by 

1) alignment to the onset of the pulse or to the 
first peak, or 

2) alignment to maximum cross-correlation, or 

b) the HTFs (B) from which the linear phase part
and/or the all-pass phase part has been removed,

the averaging being optionally followed by addition of a
linear phase component giving an interaural time
difference, the linear phase component or the interaural
time difference suitably being obtained in a separate
averaging of the linear phase components or the
interaural time differences of the original HTFs (B). The
frequency axis, or a section or sections thereof, or the time
axis, or a section or sections thereof, may have been
compressed or expanded individually for each HTF to
reduce the differences between the HTFs before the
averaging.
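
Time alignment to maximum cross-correlation followed by averaging in the time domain can be sketched as follows. The function names and the toy impulse responses are hypothetical, and the sketch assumes impulse responses that are zero-padded so that the circular shift performed by `np.roll` does not wrap meaningful samples.

```python
import numpy as np

def align_to_reference(ir, reference):
    """Shift ir so that its cross-correlation with reference peaks at zero lag."""
    xcorr = np.correlate(ir, reference, mode="full")
    lag = int(np.argmax(xcorr)) - (len(reference) - 1)   # positive lag: ir arrives later
    return np.roll(ir, -lag), lag

def average_time_aligned(irs):
    """Align every impulse response to the first one, then average sample by sample."""
    reference = irs[0]
    aligned = [align_to_reference(ir, reference)[0] for ir in irs]
    return np.mean(aligned, axis=0)

# Toy data: the same impulse response with three different onset delays.
reference = np.array([0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0])
irs = [reference, np.roll(reference, 2), np.roll(reference, 1)]
average = average_time_aligned(irs)
lag_of_second = align_to_reference(irs[1], reference)[1]
```

The lags removed in this way could be averaged separately and reintroduced afterwards as the linear phase component described in the text.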

[0076] A set of HTFs relating to at least two angles of
sound incidence may consist of HTFs obtained
according to any of the above-described principles. The set
may comprise HTFs (A) each of which has been
individually selected among HTFs, not necessarily among
HTFs from the same origin, preferably using the real life
listening selection method mentioned above.
[0077] The invention provides a number of specific
high quality HTFs which are completely defined. Thus,
the invention relates to an HTF (A) which is selected
from the group consisting of the 97 HTFs shown in each
of Fig. 1, Fig. 2 and Fig. 3. These HTFs, described as in
the figures, or in the form of tables, are extremely
valuable commercial tools with hitherto unattainable quality,
in any kind of technique where HTFs are used.
[0078] The invention also provides HTFs which are
useful derivatives constructed on the basis of the above
specific HTFs, namely HTFs obtained by interpolation
between two or more of the 97 HTFs shown in each of
Fig. 1, Fig. 2 and Fig. 3, or HTFs which, when used for
binaural synthesis, give an audible impression which is
not clearly different from the impression given by an
HTF (D) shown in any of the figures in question or
obtained by interpolation therebetween. In this context,
the term "clearly different" means that a panel of
inexperienced listeners obtains a score of at least 90 per cent,
preferably at least 80 and more preferably at least 70
and most preferably at least 50, per cent correct
answers when the two HTFs (A) and (D) are compared
in a balanced four-alternative-forced-choice test, using
programme material for which the HTFs are used or for
which the HTFs are intended to be used.
[0079] For any preferred HTF (A) according to the
invention,

a) the reference point of the HTF (B) or the HTFs
(B) is at the entrance, or close to the entrance, to the
blocked ear canal, and the HTFs (B) have been
obtained from a group of test persons that is
representative for the group of users for whom the HTFs
(A) are intended, and/or

b) the HTF (A) is one which, when used for binaural
synthesis, gives an audible impression which is not
clearly different from the impression given by an
HTF (D) according to a).

[0080] An HTF or a set of HTFs as described herein
may be adapted to an individual listener or a group of
listeners by modifying the interaural time difference of
the HTF or the set of HTFs, the modification being
based on

a) the physical dimension of the listener or the
listeners, such as head diameter, distance between
the ears, etc., or

b) a psychoacoustic experiment, where the HTF or
the set of HTFs is used for binaural synthesis and
the interaural time difference for each angle of a
selected set of angles of sound incidence is
adjusted so that the sound impression as perceived
by the individual listener or the group of listeners is
found to give a high degree of conformity with real
life listening to a sound source in the direction in
question.

[0081] Certain aspects of the invention relate to the
construction of HTFs by approximation. These aspects
are very valuable in many contexts, e.g. for small
changes in position or orientation of the head. Thus, in
one aspect of the invention, an approximate HTF for an
angle of sound incidence may be obtained by
interpolating HTFs corresponding to neighbouring angles of
sound incidence, the interpolation being carried out as a
weighted average of neighbouring HTFs, the averaging
procedure preferably being performed as described
above. In another aspect, an approximated HTF (A) can
be made on the basis of a nearby HTF (B) by performing
an adjustment of the linear phase of the HTF (B) to
obtain substantially the interaural time difference
pertaining to the angle of incidence for which the
approximated HTF (A) is intended.
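
Both approximations can be sketched briefly. The first function interpolates the magnitudes of two neighbouring HTFs with weights that vary linearly with angle; the second adjusts the linear phase of an HTF, which delays its impulse response. The function names, the linear weighting, and the toy values are assumptions made for illustration and are not taken from the specification.

```python
import numpy as np

def interpolate_magnitude(angle, angle_a, mag_a, angle_b, mag_b):
    """Weighted average of two neighbouring HTF magnitudes (assumed linear in angle)."""
    w_b = (angle - angle_a) / (angle_b - angle_a)
    w_a = 1.0 - w_b
    return w_a * np.asarray(mag_a, dtype=float) + w_b * np.asarray(mag_b, dtype=float)

def apply_delay(H, freqs_hz, delay_s):
    """Add a linear phase component corresponding to a delay of delay_s seconds."""
    return np.asarray(H) * np.exp(-2j * np.pi * np.asarray(freqs_hz) * delay_s)

# Interpolating for 10 degrees between HTFs at 0 and 30 degrees (toy magnitudes).
mag_10 = interpolate_magnitude(10.0, 0.0, [1.0, 2.0], 30.0, [4.0, 5.0])

# Delaying a unit impulse by 2 samples at a toy sampling rate of 8 Hz.
n, fs = 8, 8.0
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
H_delayed = apply_delay(np.ones(len(freqs), dtype=complex), freqs, 2 / fs)
h_delayed = np.fft.irfft(H_delayed, n=n)
```

Applying `apply_delay` to only one ear part of an HTF changes the interaural time difference by `delay_s` while leaving the magnitude response untouched.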

[0082] One aspect of the invention relates to a method
of obtaining an approximate HTF for a short distance
between the listener and the sound source, comprising

a) combining

the left ear part of an HTF representing the
geometric angle from the source position to the
left ear position or optionally, if the left ear is not
visible from the source position, the geometric
angle from the source position tangentially to
the part of the head obscuring the ear, with

the right ear part of an HTF representing the
geometric angle from the source position to the
right ear position or optionally, if the right ear is
not visible from the source position, the geometric
angle from the source position tangentially
to the part of the head obscuring the ear,

and/or

b) individually adjusting the level of the left ear and the
right ear parts of the HTF. The individual adjustment
of the level of the left ear and the right ear
parts of the HTF may be performed in accordance
with the distance law for spherical sound waves,
using the geometrical distance to the middle of the
head and the geometrical distance to each of the
two ears or optionally, where an ear is not visible
from the source position, the geometrical distance
to the tangent point of the part of the head obscuring
the ear or to the ear passing the tangent point
and following the curvature of the head.
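
The level adjustment according to the distance law for spherical waves (sound pressure inversely proportional to distance) can be sketched as follows. The function name `ear_level_gain` and the coordinates are hypothetical, and the sketch uses straight-line distances for both ears; it does not model the optional path via the tangent point for an ear hidden behind the head.

```python
import math

def ear_level_gain(source, head_centre, ear):
    """Pressure gain for one ear relative to the level at the middle of the head.

    For a spherical wave the pressure amplitude is proportional to 1/r, so the
    gain is the distance to the head centre divided by the distance to the ear.
    """
    return math.dist(source, head_centre) / math.dist(source, ear)

# Assumed geometry in metres: source 0.5 m to the left of the head centre.
source = (0.0, 0.5, 0.0)
head_centre = (0.0, 0.0, 0.0)
left_ear = (0.0, 0.1, 0.0)
right_ear = (0.0, -0.1, 0.0)

gain_left = ear_level_gain(source, head_centre, left_ear)
gain_right = ear_level_gain(source, head_centre, right_ear)
gain_left_db = 20.0 * math.log10(gain_left)
```

With this geometry the near ear is raised by a factor of 1.25 (about 1.9 dB) and the far ear is lowered, an effect that becomes negligible as the source distance grows.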

[0083] As described above, one of the applications of
the HTF (A) is to use a set of HTFs (A) as a design
target for signal processing means, such as a set of digital
filter pairs, used to simulate the transmission of sound
from a set of (fictive) sound sources to the left and right
ears of the listener. The transfer functions of the set of
digital filter pairs are designed to correspond to the
appertaining HTFs (A). A binaural signal is generated
by filtering a set of sound signals corresponding to the
set of (fictive) sound sources with the set of digital filter
pairs.
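
The generation of a binaural signal from several (fictive) sound sources can be sketched as follows, assuming each digital filter pair is available as a pair of finite impulse responses. The function name `binaural_synthesis` and the toy signals and filters are hypothetical.

```python
import numpy as np

def binaural_synthesis(sources, filter_pairs):
    """Filter each source with its (left, right) impulse responses and sum per ear."""
    n_out = max(len(s) + max(len(hl), len(hr)) - 1
                for s, (hl, hr) in zip(sources, filter_pairs))
    left = np.zeros(n_out)
    right = np.zeros(n_out)
    for s, (hl, hr) in zip(sources, filter_pairs):
        yl = np.convolve(s, hl)          # transmission to the left ear
        yr = np.convolve(s, hr)          # transmission to the right ear
        left[:len(yl)] += yl
        right[:len(yr)] += yr
    return left, right

# Two toy sources, each with its own filter pair.
sources = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
filter_pairs = [(np.array([1.0, 0.5]), np.array([0.0, 0.8])),
                (np.array([0.0, 0.6]), np.array([1.0, 0.2]))]
left, right = binaural_synthesis(sources, filter_pairs)
```

The two returned arrays together form the binaural signal; each additional source adds one more filter pair and one more term in each sum.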

[0084] Thus, an HTF may be obtained from the above
HTFs according to the invention by further processing,
such as filtering, equalizing, delaying, modelling, or any
other processing that maintains the information
contents inherent in the original HTF or set of HTFs, the
said further processing being substantially identical for
the left and right ear parts of the HTF, or for a set of
HTFs corresponding to different angles of sound
incidence being substantially identical for the different
directions but not necessarily identical for the left and
the right ear parts of the HTFs.
[0085] Examples of such signal processing which are
useful in various applications are signal processings
which have been performed so that

a) the HTF of a specific angle, e.g. in the frontal
plane, has a flat frequency response, or

b) the amplitude of a binaural signal formed by binaural
synthesis of a diffuse sound field is substantially
identical to the amplitude of the diffuse sound
field itself, or

c) the amplitude of a binaural signal formed by binaural
synthesis of a specific sound field is substantially
identical to the amplitude of the sound field at
the p1 reference point.

[0086] In some practical uses of the method of the
invention, e.g., mixing consoles, at least two sound
inputs (1) are combined into one sound input (2) which
is filtered with one set of two filters simulating an HTF.
Typically, the sound inputs (1) which are combined are
sound inputs belonging together in spatial groups, such
as "from the front", "from behind", "from the right side",
"from the left side", etc., in relation to the listener.
[0087] An important use of the binaural synthesis
method of the invention is for simulation of a sound field
of a specific environment, such as a room, e.g. a
concert hall, wherein transmission of sound from a set of
sound sources with specific positions in said
environment to a receiving point with a specific position in said
environment is simulated by

a) forming, for each of a number of transmission
paths for each sound source, a binaural signal (A),
and

b) combining the binaural signals (A) for each
sound source into a binaural signal (B), and

c) combining the binaural signals (B) of the set of
sound sources into a resulting binaural signal (C).

[0088] Another important utilization of the invention is
for noise measurement and/or assessment of the effect
of noise, or any other measurement and/or simulation
where a description of a sound transmission is involved,
in which binaural signals produced as
discussed herein and/or HTFs as characterized herein are
utilized to increase the generality.
[0089] For some uses of the invention, including, e.g.,
virtual reality applications or teleconferencing, it is
useful to sense position and/or orientation, and/or changes
in position and/or orientation, of the head of a listener
and modify the electronic signal processing in
dependence of the sensed position and/or orientation and/or
changes in position and/or orientation. This could, e.g.,
be used to give the impression that the virtual sources
remain in position irrespective of head movements.
[0090] The sensing of the position and/or orientation, 
and/or changes in position and/or orientation, of the 
head of a listener, may be performed by 

a) transmitting at least one pulse of energy, such as
an ultrasonic wave pulse or an infrared light pulse,
adapted to be received by one or more receiving 
means mounted at and following the movements of 
the head of the listener, 

b) detecting the arrival time or each of the arrival 
times of the transmitted energy pulse or pulses at 
the receiving means or each of the receiving means 
and optionally detecting or recording the time of 
transmission or each of the times of transmission 
from the corresponding transmitter or transmitters, 
and 

c) calculating the position and/or orientation of the 
head of the listener based on the detected arrival 
time or times and optionally on the detected or 
recorded time or times of transmissions. 
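
Step c) can be illustrated for the position of a single receiving means, under assumptions that go beyond the text: ultrasonic pulses travelling at an assumed 343 m/s, four transmitters at known and non-coplanar positions, known transmission times, and a linearised least-squares solution. The function name `locate_receiver` and all coordinates are hypothetical; determining orientation would additionally require two or more receiving means on the head.

```python
import numpy as np

SPEED_OF_SOUND_M_PER_S = 343.0   # assumed propagation speed of the ultrasonic pulse

def locate_receiver(tx_positions, transmit_times, arrival_times,
                    c=SPEED_OF_SOUND_M_PER_S):
    """Estimate a receiver position from pulse travel times to known transmitters."""
    tx = np.asarray(tx_positions, dtype=float)
    d = c * (np.asarray(arrival_times, dtype=float)
             - np.asarray(transmit_times, dtype=float))   # travel time -> distance
    # Subtracting the range equation of transmitter 0 from the others removes the
    # quadratic term and leaves a linear system A x = b in the unknown position x.
    A = 2.0 * (tx[1:] - tx[0])
    b = (np.sum(tx[1:] ** 2, axis=1) - np.sum(tx[0] ** 2)
         - d[1:] ** 2 + d[0] ** 2)
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position

# Simulated set-up: four transmitters and a receiver at a known "true" position.
tx = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]])
true_position = np.array([0.5, 1.0, 1.2])
arrivals = np.linalg.norm(tx - true_position, axis=1) / SPEED_OF_SOUND_M_PER_S
estimate = locate_receiver(tx, np.zeros(4), arrivals)
```

With noise-free arrival times the estimate reproduces the simulated position; with real measurements the least-squares formulation averages out part of the timing error when more than four transmitters are used.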

[0091] The signal processing in the method of the
invention can, if desired, additionally include compensation
of transfer characteristics of a signal-to-sound
transducer, such as its frequency dependent sensitivity,
impedance relations, etc., thereby approaching the
perception of an ideal signal-to-sound transducer. Further,
the characteristics of the transmission of sound from the
signal-to-sound transducer to a specific point, e.g. to a
specific point in the ear canal of a listener, could be
included in the compensation. On the other hand, many
sound reproductions which are perceived as pleasant or
interesting do in fact include transfer characteristics or
coloration of loudspeakers, or sound modifications
characteristic of the room in which the loudspeakers are
arranged, and thus, another interesting possibility is to
supplement the binaural signal with echoes and/or
reverberation and/or coloration to simulate a non-uniform
signal response of the virtual signal-to-sound
transducers and/or to simulate that the virtual
signal-to-sound transducers are arranged in an imaginary room.
These additional signals may or may not be coded with
directional and/or distance information about their
virtual sound sources.

[0092] As indicated above, the signal processing may
additionally include compensation for the difference in
pressure division at the input to the ear canal when the
ear is occluded, respectively unoccluded, by a
headphone. A way of obtaining a description of the difference
in pressure division at the input to the ear canal when
the ear is occluded, respectively unoccluded, by a
headphone, comprises measuring the transmission
from the headphone to the sound pressure

at the entrance, or close to the entrance, of the
blocked ear canal, and

at the entrance, or close to the entrance, of the
open ear canal,

the ratio of the frequency domain descriptions of these
transmissions being obtained as characteristic of the
pressure division (X) in this situation,
and

measuring the transmission from a sound source that
does not influence the acoustic radiation impedance of
the ear, to the sound pressure

at the entrance, or close to the entrance, of the
blocked ear canal, and

at the entrance, or close to the entrance, of the
open ear canal,

the ratio of the frequency domain descriptions of these
transmissions being obtained as characteristic of the
pressure division (Y) in this situation,
and obtaining the ratio X/Y which constitutes the
frequency domain description of the difference in pressure
division.
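
The ratio X/Y can be sketched as follows. The text does not state which of the two transmissions forms the numerator of each pressure division; this sketch assumes open ear canal divided by blocked ear canal for both X and Y. The function name `pressure_division_difference` and the toy spectra are hypothetical.

```python
import numpy as np

def pressure_division_difference(hp_blocked, hp_open, ref_blocked, ref_open):
    """Return X/Y per frequency bin.

    X: pressure division with the headphone as the sound source.
    Y: pressure division with a source that does not load the ear.
    Assumption: each pressure division is taken as open / blocked.
    """
    X = np.asarray(hp_open, dtype=complex) / np.asarray(hp_blocked, dtype=complex)
    Y = np.asarray(ref_open, dtype=complex) / np.asarray(ref_blocked, dtype=complex)
    return X / Y

# Toy transfer function values at two frequencies.
ratio = pressure_division_difference(hp_blocked=[2.0, 2.0], hp_open=[1.0, 4.0],
                                     ref_blocked=[1.0, 1.0], ref_open=[1.0, 2.0])
```

A ratio close to one at all frequencies would indicate that the headphone loads the ear in much the same way as the reference source, so that little compensation is needed.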

[0093] Any compensation for signal-to-sound
transducers such as headphones and loudspeakers may be
adapted to the individual listener, by determining the
appropriate transfer characteristics for the individual
user.

[0094] The signals subjected to the signal processing
described above could be signals which are adapted to
be decoded into sound representing signals, e.g.
broadcast signals, by decoding them in the manner
corresponding to the coding scheme of the appropriate
sound reproducing system and then processing them
into a binaural signal as described above. Whether or
not a particular broadcast signal is adapted to be
decoded in a particular system can easily be assessed
by providing the signal to a decoder pertaining to the
system and analysing the decoded signals.
[0095] Headphones constitute preferred signal-to- 
sound transducers for the binaural signal. In the present 
context, the term headphones includes conventional 
headphones and any other sets of two portable signal- 
to-sound transducer units adapted to be placed on a 
human adjacent or close to the ears of the human.
[0096] Especially attractive headphones for use in the 
method of the invention could be wireless headphones 
adapted for any kind of wireless transmission of the bin- 
aural signal, such as electromagnetic, optical, infrared, 
ultrasonic, etc. 

[0097] The binaural signal is normally adapted to be 
emitted by means of headphones, but it is within the 
scope of the invention to reproduce the signal by means 
of two loudspeakers. When loudspeakers are used,
crosstalk of the loudspeakers may, if desired, be counteracted
by supplementing the binaural signal with artificial
crosstalk, which may either be incorporated in the
binaural signal or consist of additional electrical signals.
Crosstalk is caused by the fact that the left ear is able to
hear the right loudspeaker and vice versa, in contrast to
reproduction by headphones.

[0098] When two loudspeakers are used to reproduce 
the sound corresponding to the binaural signal the posi- 
tion of the listener in relation to these loudspeakers is 
rather critical because of the cross-talk phenomena. 
However, by sensing the position of the head of the lis- 
tener and modifying the electronic signal processing in 
response to the sensing, it will be possible to compensate
the cross-talk in accordance with the position of the
head of the listener, thereby dramatically improving the 
quality of the listening experience. 
[0099] Both in the cases where headphones are used 
and in the cases where two loudspeakers are used, the 
position and/or orientation, and/or changes in position 
and/or orientation, of the head of a listener can, as indi- 
cated above, be sensed by means of suitable sensing 
means, and the electronic signal processing can be 
modified in dependence of the sensed position and/or 
orientation and/or changes in position and/or orienta- 
tion. The effects aimed at in the modification may range 
from minor corrections or adjustments which are desira- 
ble in connection with head movements when listening 
to binaural sound reproduction, to modifications 
adapted to impart to the listener the perception that the 
virtual sound sources remain in position irrespective of 
the position and/or orientation, and/or changes in posi- 
tion and/or orientation, of the listener's head, or even 
modifications where special artificial effects are aimed
at, such as a perception that the virtual spatial sound
field continues to turn a little due to "inertia" after the listener
has stopped a turn of the head. As will be understood
by a person skilled in the art, such modifications
of the electronic processing are possible in particular
where the HTFs are implemented by digital filters, such
as is described in detail in the following.
[0100] One way of sensing the parameters of the position
and orientation of the listener mentioned above is to
apply a known varying magnetic field to the surroundings
of the listener and to apply a set of crossing coils
to the head of the listener. When the magnetic field
applied to the listening room is known, it is possible to
derive the position and orientation of the listener's head
from the voltages generated in the crossing sensing
coils. Analogous methods could be used for other kinds
of fields, such as ultrasonic fields, applied to the listening
room, with appropriate detectors applied to the listener's
head, or equipment based on video cameras
coupled to image recognition means could be utilized.
[0101] Other aspects of the invention relate to applications
of the HTFs used for binaural synthesis utilizing
the generality aspect of these HTFs, for example in
designing artificial heads, in designing the frequency
response of headphones, in computer models of
human binaural sound localization or perception in general,
etc.

[0102] In accordance with what is discussed above, 
an embodiment of the invention comprises transmitting 
the binaural signals in the form of modulated ultrasonic 
waves, the waves being received by a listener equipped 
with two receiving means each of which is mounted
close to the appertaining ear of the listener, changes in 
orientation of the listener's head relative to a reference 
orientation being compensated on the basis of the dif- 
ference of the travel time of the ultrasonic wave pulses 
between the two receiving means so that the listener 
will perceive that virtual sound sources remain in a ref- 
erence position irrespective of the orientation of the lis- 
tener's head, the compensation being automatic or 
carried out by involving electronic signal processing. 
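
The compensation based on the travel time difference can be sketched as follows. This is a minimal sketch under stated assumptions: a far-field (plane wave) model, a fixed speed of sound, a single ultrasonic transmitter located in the reference direction, and a hypothetical sign convention for the angles. None of these details are specified in the text.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed value)


def head_yaw_from_delay(delta_t, receiver_spacing):
    """Estimate the head rotation in radians from the arrival-time difference.

    delta_t: arrival time at one receiver minus arrival time at the other (s).
    receiver_spacing: distance between the two receiving means (m).
    Plane-wave model (assumption): delta_t = spacing * sin(yaw) / c.
    """
    s = SPEED_OF_SOUND * delta_t / receiver_spacing
    s = max(-1.0, min(1.0, s))  # clamp so measurement noise cannot break asin
    return math.asin(s)


def compensated_azimuth(source_azimuth, yaw):
    """Azimuth relative to the head at which a virtual source must be
    rendered so that it appears to stay fixed in the room."""
    return source_azimuth - yaw
```

In use, the compensated azimuth would select (or interpolate) the HTF pair applied to each virtual sound source whenever a new travel time difference is measured.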
[0103] For a number of practical uses, such as in air 
traffic control, in control of cabs or trucks, in messenger 
offices, in life saving stations, in central offices of watch- 
men, in telephone meetings, in meetings using audio- 
visual communication means, etc., the method of the 
present invention can be applied for communication,
comprising transforming, by signal processing means,

signals (A1..An) of at least one single channel communication
system and/or at least one multichannel
communication system which signals are adapted
for being supplied to at least one signal-to-sound
transducer, or

signals which are adapted for being decoded into
such signals (A1..An)

into a binaural signal (C), so that the binaural signal,
when reproduced, is capable of imparting to a receiver
of the communication a perception of listening to a spatial
sound field with a set of n individually positioned virtual
sound sources, each of which transmits one of the
signals (A1..An).

[0104] In connection with this, a valuable embodiment
is where the position and orientation of the receiver's
head are monitored, and head position and head orientation
data obtained in the monitoring are used to enable
the receiver to selectively transmit a message to one of
the transmitters corresponding to one of the signals
(A1..An) by turning his head in the direction of the virtual
sound source corresponding to said transmitter.
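
The selection of a transmitter by head direction can be sketched as a nearest-direction search. The sketch below assumes that only the azimuth of the head is used, that the list of virtual source azimuths is non-empty, and that a tolerance of 15 degrees decides whether the receiver is facing a source. The function names and the tolerance are hypothetical.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def select_transmitter(head_azimuth, source_azimuths, tolerance=15.0):
    """Return the index of the virtual source the head is turned towards,
    or None if no source lies within the tolerance (assumed value)."""
    best = min(
        range(len(source_azimuths)),
        key=lambda i: angular_distance(head_azimuth, source_azimuths[i]),
    )
    if angular_distance(head_azimuth, source_azimuths[best]) <= tolerance:
        return best
    return None
```

For example, with virtual sources at -60, 0, 60 and 180 degrees, a head azimuth of 55 degrees would address the third transmitter, while a head azimuth of 30 degrees would address none.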

[0105] A special utilization of the method of the invention
is for multichannel sound reproduction, e.g., Dolby
Surround, Stereo, Quadrophony, or any HDTV multichannel
specification, comprising transforming, by signal
processing means,

signals (A1..An) of a multichannel sound reproducing
system which signals are adapted for being
supplied to n different signal-to-sound transducers
of the multichannel sound reproducing system, or

signals which are adapted for being decoded into
such signals (A1..An)

into a binaural signal (C) by the method of the invention
so that the binaural signal, when reproduced, is capable
of imparting to a listener a perception of listening to a
spatial sound field similar to the sound field which would
have resulted from listening to the n signal-to-sound
transducers spatially arranged in a room.

[0106] A range of uses of the method of the invention
relates to situations where the binaural signals
are used for positioning a set of sounds at specific virtual
positions in relation to an operator, such as, e.g.,
operators of industrial processes, pilots and astronauts,
flight controllers, video game players, users of interactive
TV, surgeons operating on patients, etc.
[0107] One example of this is where a moving virtual
sound source with a characteristic sound moves continuously
or discontinuously between specific positions of
a set of virtual sound sources, the operator being enabled
to communicate a specific message to the system
according to a particular virtual sound source by
prompting the system when the moving virtual sound
source is positioned substantially at the position of said
virtual sound source. The position of the moving virtual
sound source may be controlled by the operator, and/or
by the orientation and/or position of the head of the
operator, and/or the positions may be dynamically controlled
by a computer in accordance with a set of rules
or a predefined scheme.

[0108] One application hereof is in guidance of the
movement of an object, such as a robot, or a person,
such as a blind person, where the method is used for
controlling or assisting the movement and/or position of
an object and/or a living being by dynamically positioning
a virtual sound source in relation to the object and/or
living being, so as to guide the object and/or the living
being in relation to the position of the virtual sound
source.

[0109] In any embodiment of the invention, the binau- 
ral signal may, of course, be stored on an audio storage 
medium or broadcast. As a special feature, each sound 
input (2) representing a combination of more than one
sound input (1) may be stored or broadcast separately,
such as in a separate track or in a separate channel, 
respectively, the binaural filtering being carried out 
before or after storing or broadcasting. 
[0110] A number of aspects of the invention comprise
the use of HTFs of the generality obtained according to
the present invention in computer modelling or analysing
the cerebral human binaural sound localization ability.

[0111] Another such aspect comprises a method for
designing headphones, wherein the transfer
characteristics of the headphones are adapted to
resemble an HTF characterized according to the invention
for a given direction, e.g., the frontal direction, or to
resemble weighted averages of such HTFs corresponding
to averages of given directions.
[0112] A further such aspect relates to an artificial head
having HTFs which correspond substantially to HTFs
determined according to the invention for all angles of
sound incidence, or at least for angles of sound incidence
which constitute part of the total sphere surrounding
the artificial head, such as the upper
hemisphere or the frontal region. This can be done by
adapting the geometric characteristics of the artificial
head and/or the acoustic properties of the materials
used so as to approximate the HTFs of the artificial
head to HTFs according to the invention for all angles of
sound incidence, or at least for angles of sound incidence
which constitute part of the total sphere surrounding
the artificial head, such as the upper
hemisphere or the frontal region.
[0113] In the following, the invention will be described
in more detail, by way of example, with reference to the
accompanying drawings, in which:


Fig. 1 (1)-(6) shows the time domain description of a set
of HTFs (1) of a specific person according to the
invention, and (7)-(12) shows the frequency domain
description of the HTFs (1),

Fig. 2 (1)-(6) shows the time domain description of a set
of HTFs (2) according to the invention, obtained as an
average across HTFs for 40 persons, by averaging the
minimum phase approximation in decibels frequency by
frequency, followed by the addition of the average linear
phase parts of the HTFs, and (7)-(12) shows the
frequency domain description of the HTFs (2),

Fig. 3 (1)-(6) shows the time domain description of a set
of HTFs (3) according to the invention, obtained as an
average across 40 persons, by averaging the time
aligned time domain representations of the HTFs sample
by sample, followed by the addition of the average
delays of the HTFs, and (7)-(12) shows the frequency
domain description of the HTFs (3),

Fig. 4 is a photo of a miniature microphone mounted in
the ear of a test person to measure the pressure (p2) at
the blocked ear canal,

Fig. 5 shows the placement of a microphone at the
blocked entrance to an ear canal,

Fig. 6 is a photo of the measurement set-up in an
anechoic chamber for measurement of an HTF,

Fig. 7 shows graphs of the frequency domain
representation and the time domain representation of a
specific HTF for one test person,

Fig. 8 shows the standard deviation of the gain of HTFs
for different groups of test persons for comparison of
measurements performed according to the present
invention with measurements performed according to
prior art,

Fig. 9 shows an example of a Head-related Impulse
Response,

Fig. 10 shows the frequency domain representation of
the Head-related Impulse Response of Fig. 9 truncated
to different lengths,

Fig. 11 shows an example of a Head-related Impulse
Response adjusted for different DC values,

Fig. 12 as Fig. 11, but for the frequency domain
representations,

Fig. 13 shows an example of averaging the time domain
representations of a set of HTFs,

Fig. 14 as Fig. 13, but for the frequency domain
representations,

Fig. 15 shows an example of logarithmic averaging the
frequency domain representations of a set of HTFs,

Fig. 16 shows an example of a minimum phase
representation and an example of a zero phase
representation of an averaged set of Head-related
Impulse Responses,

Fig. 17 shows an example of averaging the time domain
representations of a set of HTFs after time alignment,

Fig. 18 as Fig. 17, but for the frequency domain
representations of the HTFs,

Fig. 19 shows an example of interpolation of the time
domain representations of the HTFs to create a new HTF
corresponding to a direction that is in between four
directions corresponding to four known HTFs,

Fig. 20 as Fig. 19, but for the frequency domain
representations,

Fig. 21 (a)-(d) shows an example of obtaining an
approximate HTF for a short distance between the
listener and the sound source,

Figs. 22, 23 and 24 show standard deviations of the
amplitude, in dB, between subjects, in the frequency
interval between 100 Hz and 8 kHz, for single
frequencies and 1/3 octave noise bands.


[0114] Figs. 1-3 show three different sets of HTFs
obtained by different methods according to the present
invention, one in each figure. In each of the figures, the
descriptions of the HTFs are characterized by their
angle of incidence, stated as (azimuth, elevation). In
each of the time domain descriptions, the upper curve
pertains to the left ear, and the lower curve pertains to the
right ear. In each of the frequency domain descriptions,
the thick line curve pertains to the left ear, and the thin
curve pertains to the right ear. The "tag" at each side of
the frequency domain curves represents 0 dB.
[0115] The HTFs shown in Figs. 1-3 are examples of
HTFs according to the current invention, the HTFs of
Fig. 1 being a single person's HTFs, whereas the HTFs
of Fig. 2 and Fig. 3 are averages across a large number
of persons, and have been obtained according to aspects
of the invention. The average HTFs of Fig. 2 have been
obtained as an average across HTFs for 40 persons, by
averaging the minimum phase approximation in decibels
frequency by frequency, followed by the addition of
the average linear phase parts of the HTFs. The HTFs
of Fig. 3 have been obtained as an average across 40
persons, by averaging the time aligned time domain
representations of the HTFs sample by sample, followed
by the addition of the average delays of the HTFs.
[0116] Fig. 6 shows a set-up for a measurement of the
HTFs according to the present invention performed in
an anechoic chamber. A known signal is sent to a loudspeaker
positioned in the direction corresponding to the
HTF to be measured. A miniature microphone of the
type Sennheiser KE 4-211-2 is placed at each of the
blocked entrances to the ear canals of the test person
as shown in Fig. 4 and Fig. 5.

[0117] The KE 4-211-2 is a pressure microphone of
the back electret type, and it has a built-in FET amplifier.
The microphone itself has a sensitivity of approximately
10 mV/Pa. Coupled with a gain as suggested in the data
sheet, the sensitivity increases to approximately 35
mV/Pa. A small battery box was used, and in order to
increase the output signal and to reduce the output
impedance, a 20 dB amplifier was built into the same
box. Two selected microphones were used throughout
the experiment, one for each ear.
[0118] The reference sound pressure p1 from the
loudspeaker was measured with each of the miniature
microphones. The microphone was placed at the position
where the middle of the test person's head would
be during measurement. In order to disturb the field as
little as possible, the microphones were fixed by a thin
wire and with an orientation giving 90° incidence of the
sound wave from the loudspeaker. In this way, the p1
measurement was minimally influenced by the presence
of the microphone in the sound field.
[0119] During measurement of the sound pressure p2
at the entrance to the blocked ear canal, the microphone
was mounted in an EAR earplug placed in the
ear canal. The microphone was inserted in a hole in the
earplug, and then the soft material of the earplug was
compressed during insertion in the ear canal. As the
earplug relaxed, the outer end of the ear canal was completely
filled out. The end of the earplug and the microphone
were mounted flush with the ear canal entrance
(see Fig. 4 and Fig. 5).

[0120] The measurements were carried out in an anechoic
chamber with a free space between the wedges of
6.2 m (length) by 5.0 m (width) by 5.8 m (height). The
test person was standing on a platform in a natural
upright position, and a small backrest mounted on the
platform helped the test person to stand still.
[0121] To assist in the control of horizontal position
and orientation of the test person's head, the test person
had a paper marker on top of the head. This marker was
observed through a video camera placed right in front of
the test person and shown on a moveable monitor to the
test person. Using this, the test person could correct
position and azimuth.
[0122] The operators had a similar monitor for
observation of the test person's exact position and for
checking that the test person did not move during each
single measurement. If movements were observed, the
measurement was discarded and redone.
[0123] The loudspeakers used were 7 cm membrane
diameter midrange units (Vifa M10MD-39) mounted in
15.5 cm diameter hard plastic balls.
[0124] The general purpose measuring system known
as MLSSA (Maximum Length Sequence System Analyzer)
was used. Maximum length sequences are binary
two level pseudo-random sequences. The basic idea of
the MLS technique is to apply an analogue version of the
sequence to the linear system under test, sample the
resulting response, and then determine the system
impulse response by cross-correlation of the sampled
response with the original sequence.
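
The MLS principle described above can be illustrated by a minimal sketch. It is not the MLSSA implementation: the sequence order (7, giving 127 points), the shift register recurrence, the simulated system, and all function names are assumptions chosen to keep the example small. The system under test is simulated by a circular convolution, which corresponds to exciting it with the sequence repeated periodically.

```python
def mls(order=7, lags=(7, 6)):
    """One period (2**order - 1 points) of a maximum length sequence.

    Bits follow b[n] = b[n-7] XOR b[n-6], i.e. the primitive polynomial
    x^7 + x + 1 (assumed choice), and are mapped to +1/-1 levels.
    """
    bits = [1] * order  # any non-zero start state works
    while len(bits) < 2 ** order - 1:
        bits.append(bits[-lags[0]] ^ bits[-lags[1]])
    return [1.0 - 2.0 * b for b in bits]  # bit 0 -> +1, bit 1 -> -1


def circular_convolve(x, h):
    """Steady-state response of a system h to the periodic input x."""
    n = len(x)
    return [sum(h[m] * x[(k - m) % n] for m in range(len(h))) for k in range(n)]


def impulse_response_from_mls(response, sequence):
    """Recover the impulse response by circular cross-correlation.

    The MLS autocorrelation is n at lag 0 and -1 elsewhere, so the
    correlation equals (n + 1) * h[k] - sum(h); the sum of all
    correlation values equals sum(h) and is added back.
    """
    n = len(sequence)
    corr = [
        sum(response[i] * sequence[(i - k) % n] for i in range(n))
        for k in range(n)
    ]
    dc = sum(corr)
    return [(c + dc) / (n + 1) for c in corr]


s = mls()
h_true = [0.0, 1.0, 0.5, -0.25]          # made-up system, for illustration
y = circular_convolve(s, h_true)         # "measured" response
h_est = impulse_response_from_mls(y, s)  # first four values match h_true
```

In a real measurement the response also contains noise, which is uncorrelated with the sequence and is therefore suppressed by the correlation.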
[0125] The above method of performing measurements
using maximum length sequences offers a
number of advantages compared to traditional frequency
and time domain techniques. The method is
basically noise immune, and combined with averaging,
the achieved signal to noise ratio is high. A thorough
review of the MLS method is given by Rife and Vanderkooy:
"Transfer-function measurement with maximum-length
sequences", Journal of the Audio Engineering
Society, vol. 37, no. 6.

[0126] For the purpose of measuring at both ears
simultaneously, two MLSSA systems were used, coupled
in a master-slave configuration by a purpose-made
synchronization unit allowing sample synchronous
measurements.

[0127] The 4 V peak-to-peak stimulus signal from the
master MLSSA board was sent to the power amplifier
(Pioneer A-616) that was modified to have a calibrated
gain of 0.0 dB. From the output it was directed through
a switch-box to the loudspeaker in the measurement
direction. The free field sound had a level of 75 dB(A) at
the test person's position, a level where the stapedius
was assumed to be relaxed.
[0128] From the microphone the signal was sent 
through a measuring amplifier, B&K 2607. 
[0129] The sampling frequency of 48 kHz was provided
by an external clock. To avoid frequency aliasing,
the 20 kHz Chebyshev low pass filter of the MLSSA
board and the 22.5 kHz low pass filter of the measuring
amplifier were used. Also the 22.5 Hz high pass filter on
the measuring amplifier was active.



[0130] Preliminary measurements on the free field 
setup using the maximum MLS length offered by
MLSSA, 65535 points, showed that a length of 4095
points was sufficient to avoid time aliasing. In order to 
achieve a high signal to noise ratio, the recording was 
averaged 16 times, called pre-averaging in the MLSSA 
system. Even with this averaging the total time for a 
measurement was as short as 1.45 seconds. During 
this period the test persons were normally able to stand 
still. All measured impulse responses were very short, 
and only the first 768 samples of each impulse 
response, corresponding to 16 milliseconds, were com- 
puted and saved. 
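
The figures quoted above can be checked with a few lines of arithmetic. The sampling frequency, sequence length, number of averages and number of saved samples are taken from the text. The interpretation that the 1.45 seconds correspond to 17 sequence periods (16 averaged periods plus one initial period during which the system settles) is an assumption of this sketch and is not stated in the text.

```python
FS = 48_000          # sampling frequency in Hz (from the text)
MLS_LENGTH = 4095    # points per MLS period (from the text)
PRE_AVERAGES = 16    # number of averaged periods (from the text)
SAVED_SAMPLES = 768  # samples saved per impulse response (from the text)

period_s = MLS_LENGTH / FS               # duration of one MLS period
averaged_s = PRE_AVERAGES * period_s     # 16 averaged periods

# ASSUMPTION: one additional period is played before recording starts,
# so that the periodic steady state is reached.
with_settling_s = (PRE_AVERAGES + 1) * period_s

saved_ms = 1000.0 * SAVED_SAMPLES / FS   # duration of the saved samples
```

Under this assumption the total measurement time comes to about 1.45 seconds, and the 768 saved samples span 16 milliseconds, in agreement with the values given above.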

[0131] Results of the measurements were impulse 
responses for the transmission from input to the power 
amplifier to output of the measuring amplifier. The post 
processing needed to obtain the wanted information 
was carried out in MATLAB. 

[0132] The measured impulse responses all included
an initial delay, corresponding to the propagation time
from the loudspeaker to the measuring point (approximately
6 milliseconds). All responses were very short,
with a duration of only a few milliseconds. Therefore, only
samples 256 through 511 were processed (time from
5.33 ms to 10.65 ms). The restriction to this time window
eliminated reflections from the monitor in the anechoic
chamber.

[0133] For determination of the HTF (P2/P1) the
selected portions of the p1 and p2 impulse responses
were Fourier transformed, and a complex division was
carried out in the frequency domain. As the same equipment
was involved during measurement of p1 and p2,
the influence of equipment cancels out in the division.
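
The windowing and complex division can be sketched as follows. This is a minimal sketch, not the MATLAB post-processing actually used: the plain discrete Fourier transform, the function names and the synthetic test responses are assumptions. The default window of samples 256 through 511 corresponds to 256/48000 s = 5.33 ms through 511/48000 s = 10.65 ms at the 48 kHz sampling frequency.

```python
import cmath


def dft(x):
    """Plain discrete Fourier transform (slow but dependency-free)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def htf_from_impulse_responses(p1, p2, start=256, stop=512):
    """Return P2/P1 per frequency bin from samples start..stop-1.

    p1: reference impulse response (microphone at the head position).
    p2: impulse response at the blocked ear canal entrance.
    NOTE: bins where P1 is zero would raise ZeroDivisionError; a real
    measurement would need to guard against that.
    """
    spectrum_1 = dft(p1[start:stop])
    spectrum_2 = dft(p2[start:stop])
    return [b / a for a, b in zip(spectrum_1, spectrum_2)]


# Synthetic 8-sample example: p2 is p1 delayed by one sample and halved.
p1 = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
p2 = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
htf = htf_from_impulse_responses(p1, p2, start=0, stop=8)

# A gain common to both measurements (the "equipment") cancels out.
htf_scaled = htf_from_impulse_responses(
    [3.0 * v for v in p1], [3.0 * v for v in p2], start=0, stop=8
)
```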
[0134] If it is desirable to simulate the HTF using analog
filters, then the frequency domain representation of
the HTF can form the basis for the synthesis of analog
implementations of the filters as described in any textbook
on filter synthesis.

[0135] The impulse response of the HTF was determined
through an inverse Fourier transform of P2/P1.
Before the transformation, P2/P1 was filtered by a 4th
order Butterworth filter (bilinearly transformed) in order
to prevent frequency aliasing.
[0136] If it is desirable to simulate the HTF using digital
techniques, then the Head-related Impulse Responses
can be digitised and stored in the storage(s) of the digital
implementations of the filters.
[0137] An example of the frequency domain representation
and the time domain representation of a specific
HTF for one test person is shown in Fig. 7. To benefit
from these advantageous HTFs it is important to understand
that the signal-to-sound transducers, such as headphones,
have to be calibrated correctly.
[0138] As already mentioned, the entrance to the
blocked ear canal has been chosen as the measurement
point because the individual differences between
HTFs of different test persons have been found to be
very low, among other things because of this choice. It
has been shown that a major part of the differences
between individual HTFs are added by the transmission 
of the sound pressures through the individual ear 
canals. Thus, it is important to be able to reproduce the 
sound pressures, e.g. by headphones, at the reference 
point of the measurement at the entrance to the blocked 
ear canal without adding any individual differences to 
the sound pressures. This means that the transfer func- 
tion describing the characteristics of transmission of a 
sound signal from the terminals of the headphones to 
the reference point at the blocked ear canal must have a 
flat frequency response so that the frequency domain 
representations of the HTFs will not be distorted. 
[0139] Further, the headphone must be open, as
defined in the above mentioned tutorial by Henrik
Møller, which is equivalent to having a free field
equivalent coupling to the ear, as it has later been
denoted, so that the impedance seen looking outwards from the
ear is not changed when the headphone is applied to
the ear, or alternatively the headphones should be
adjusted to compensate for their transmission impedance.
[0140] Fig. 8 shows the standard deviation of the gain
of HTFs for different groups of test persons for comparison
of measurements performed according to the
present invention with measurements performed
according to prior art. The graphs of Fig. 8 are based on
measurements of the HTFs of a significant number of
test persons. The prior art measurements are disclosed
in: F.L. Wightman and D. Kistler, "Headphone Simulation
of Free-Field Listening, I: Stimulus Synthesis, II:
Psychoacoustical Validation," J. Acoust. Soc. Am.
85(2), 858-878, 1989 and in: P.A. Hellström and A.
Axelsson, "Miniature microphone probe tube measurements
in the external auditory canal", J. Acoust. Soc.
Am. 93(2), 907-919, 1993. The graphs show the standard
deviation of the gain as a function of frequency averaged
for all directions in 1/3 octave bands. It is seen that
the present invention provides an improvement by
approximately a factor of 2 over the known methods,
and thereby provides a significant improvement compared
to prior art techniques.

[0141] Fig. 9 shows a typical example of a Head- 
related Impulse Response. Different lengths of this 
impulse response (starting from t = 0 in Fig. 9) are Fou- 
rier transformed and the results are shown in Fig. 10. 
The DC adjustment described below is performed
before each Fourier transformation, after truncation of
the impulse response. It is seen from Fig. 10 that no significant
changes in the frequency domain representation
of the impulse response occur for impulses longer
than 1 ms. As explained earlier, when evaluating the
duration of the part of the Head-related Impulse
Responses used in the simulation, it is important to
study its frequency response. Examples are reported
where an apparently short impulse cannot be truncated
to a few milliseconds, as the truncation changes its frequency
response to an unacceptable extent because
the impulse contains essential information over a longer
time duration. Figs. 9 and 10 illustrate that this is not
the case for the impulse responses of the present invention.
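
The effect of truncation can be examined numerically along the following lines. This is a minimal sketch with assumptions: the function names are hypothetical, the impulse response used in the example is a synthetic decaying exponential rather than a measured Head-related Impulse Response, and the DC adjustment mentioned in the text is omitted for brevity.

```python
import cmath
import math


def gain_db(h, n_fft):
    """Gain in dB at n_fft // 2 + 1 frequencies from 0 Hz to fs / 2."""
    out = []
    for k in range(n_fft // 2 + 1):
        value = sum(
            h[t] * cmath.exp(-2j * cmath.pi * k * t / n_fft)
            for t in range(len(h))
        )
        out.append(20.0 * math.log10(max(abs(value), 1e-12)))
    return out


def truncate(h, duration_ms, fs):
    """Keep only the first duration_ms milliseconds of h."""
    return h[: int(round(duration_ms * 1e-3 * fs))]


def max_gain_change_db(h, duration_ms, fs, n_fft=256):
    """Largest gain difference (dB) between h and its truncated version."""
    full = gain_db(h, n_fft)
    short = gain_db(truncate(h, duration_ms, fs), n_fft)
    return max(abs(a - b) for a, b in zip(full, short))


# Synthetic response (illustrative only): 2 ms long at 48 kHz.
fs = 48_000
h = [0.5 ** t for t in range(96)]
change_1ms = max_gain_change_db(h, 1.0, fs)       # negligible change
change_short = max_gain_change_db(h, 0.0625, fs)  # 3 samples: clear change
```

A response is suitable for truncation to a given length when the largest gain change stays below whatever tolerance the application accepts.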
[0142] As mentioned before, until the present invention,
the value at zero Hz of the frequency domain representation
of the HTF (the DC value of the HTF) seems
to have attracted little or no attention in the art. However,
the research and development of the present
inventors has revealed that the DC value has a significant
influence on the frequency domain representation
of the HTF, thereby influencing the sound quality, such
as coloration, when the HTF is used in sound reproduction.
Fig. 11 shows an example of a Head-related
Impulse Response adjusted for different DC values and
Fig. 12 shows the corresponding frequency domain representations.
It is interesting to note that the influence
on the time domain representations of the HTFs is
barely seen, while the influence on the
frequency domain representations is significant.
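
Why a DC change is barely visible in the time domain can be seen from a minimal sketch. The DC value of an impulse response equals the sum of its samples. The adjustment method below, a constant offset spread over all samples, is an assumption made for illustration; this section does not state how the adjustment is carried out. The impulse response values are made up.

```python
import math


def dc_gain_db(h):
    """Gain at 0 Hz in dB: the DC value is the sum of the samples."""
    return 20.0 * math.log10(abs(sum(h)))


def set_dc_value(h, target_dc=1.0):
    """Shift every sample by the same small amount so that the sum of
    the samples equals target_dc (assumed adjustment method)."""
    offset = (target_dc - sum(h)) / len(h)
    return [x + offset for x in h]


# Made-up 64-sample response whose samples sum to 0.5 (about -6 dB at DC).
h = [0.0, 0.9, -0.3, -0.1] + [0.0] * 60
adjusted = set_dc_value(h, target_dc=1.0)

largest_sample_change = max(abs(a - b) for a, b in zip(adjusted, h))
```

In this example each sample moves by less than 0.01 while the gain at 0 Hz moves from about -6 dB to 0 dB.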
[0143] Fig. 13 shows the time domain representations
of the HTFs of a specific direction for one ear for a group
of test persons, and the average value of these
HTFs is also shown (in this context the term averaging
means the averaging of any function of the pressures
measured, such as the pressure itself or the logarithmic
pressure, or p² (the power average), etc.).

[0144] Fig. 14 shows the gain of the corresponding 
frequency domain representations of the HTFs of Fig. 
13 and also the average gain is indicated. 
[0145] Fig. 15 shows the gain of the HTFs shown in
Fig. 14 but with the logarithmic average also shown. It
will be noted that the logarithmic average seems to represent
the group of HTFs better than the average shown
in Fig. 14.
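
The difference between the two kinds of average can be illustrated at a single frequency. The sketch assumes that the average of Fig. 14 is taken over linear gains and that the logarithmic average is taken over the gains expressed in decibels; the function names and the two example values are hypothetical.

```python
import math


def linear_average_db(gains_db):
    """Average the linear gains, then express the result in dB."""
    linear = [10.0 ** (g / 20.0) for g in gains_db]
    return 20.0 * math.log10(sum(linear) / len(linear))


def logarithmic_average_db(gains_db):
    """Average the dB values directly (geometric mean of linear gains)."""
    return sum(gains_db) / len(gains_db)


# Two made-up subjects at one frequency: a peak and a deep dip.
gains = [0.0, -20.0]
lin = linear_average_db(gains)       # dominated by the larger gain
log = logarithmic_average_db(gains)  # midway between the two in dB
```

The linear average is pulled towards the subject with the larger gain, so dips that occur at slightly different frequencies for different persons tend to be filled in, whereas the logarithmic average weights peaks and dips equally on the decibel scale.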

[0146] In Fig. 14 and Fig. 15 only the gain is averaged,
which leaves the phase to be defined. Several possibilities
exist. Fig. 16 shows the time domain representation
of the averaged HTFs with the minimum phase added,
and the corresponding average with a zero phase
is also shown.

[0147] Fig. 17 and Fig. 18 show the time domain representations
and the frequency domain representations
of the HTFs of a specific direction for one ear for a group
of test persons, and the average value of these
HTFs is also shown, but after time alignment. The time alignment
is performed, as the name indicates, in the
time domain, e.g. by alignment to the onset of the
pulses, alignment to the first peak, or alignment to
maximum cross-correlation. In Fig. 17 and Fig. 18 the
impulses are aligned to the onset of the impulses. It will
be seen that the averages provided this way seem to
reproduce more features of the HTFs than the averages
without the time alignment.
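
Alignment to the onset followed by sample by sample averaging can be sketched as follows. The onset criterion (the first sample reaching 10 % of the peak magnitude), the function names and the two synthetic responses are assumptions made for illustration; the text does not define how the onset is detected.

```python
def onset_index(h, threshold=0.1):
    """Index of the first sample whose magnitude reaches
    threshold * peak magnitude (assumed onset criterion)."""
    peak = max(abs(x) for x in h)
    for i, x in enumerate(h):
        if abs(x) >= threshold * peak:
            return i
    return 0


def aligned_average(responses, threshold=0.1):
    """Align all responses to their onsets and average sample by sample.

    Returns (average, mean_onset); mean_onset is the average delay in
    samples, which can be added back to the averaged response.
    """
    onsets = [onset_index(h, threshold) for h in responses]
    length = min(len(h) - o for h, o in zip(responses, onsets))
    shifted = [h[o:o + length] for h, o in zip(responses, onsets)]
    average = [sum(column) / len(column) for column in zip(*shifted)]
    return average, sum(onsets) / len(onsets)


# Two made-up responses with the same shape but different delays.
a = [0.0, 0.0, 1.0, -0.5, 0.0, 0.0]
b = [0.0, 0.0, 0.0, 1.0, -0.5, 0.0]
average, mean_onset = aligned_average([a, b])
unaligned = [(x + y) / 2.0 for x, y in zip(a, b)]
```

Here the aligned average keeps the common shape (1.0, -0.5), while the unaligned average smears it into a lower and wider pulse.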

[0148] The time alignment can be performed for the
transfer functions of both ears together or independently
for the transfer functions of each ear.

[0149] After time alignment and averaging, a linear
phase is added to the averaged functions to account for
the interaural time difference. The linear phase contribution
to the function is calculated on the basis of the
measured appertaining HTFs, such as the average of
the linear phase contributions of all the HTFs.
[0150] Yet another way of averaging the HTFs of a
specific direction is to perform a sort of parametric
averaging by aligning the time domain representations
according to significant features, e.g. aligning peaks
and valleys of the HTFs either in the time domain or in
the frequency domain, including stretching or compressing
the X-axis (time or frequency) in between peaks and
valleys, followed by an averaging of the resulting functions
and followed by the addition of the calculated, e.g.
averaged, phase contribution.

[0151] In many applications, e.g. in virtual reality applications,
it is desirable to be able to simulate a huge
number of HTFs. According to the invention it is possible
to simulate HTFs from a set of specific HTFs using
interpolation.

[0152] For example, an HTF corresponding to a specific
direction that lies in between the directions corresponding
to four known HTFs could be calculated
according to any of the calculation methods described
above in the sections concerning averaging techniques.
Fig. 19 and Fig. 20 show examples of this in the time
domain and in the frequency domain.
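
As an illustration of interpolating between four known HTFs, the sketch below forms a weighted average of four time-aligned impulse responses. The bilinear weighting over azimuth and elevation and the function names are assumptions made for this example; the text only requires that one of the averaging techniques be applied to the neighbouring HTFs.

```python
import numpy as np

def bilinear_weights(az, el, az0, az1, el0, el1):
    """Weights for the grid corners (az0, el0), (az1, el0), (az0, el1), (az1, el1)."""
    u = (az - az0) / (az1 - az0)
    v = (el - el0) / (el1 - el0)
    return np.array([(1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v])

def interpolate_hrir(corner_irs, weights):
    """Weighted average of four time-aligned impulse responses, one row per corner."""
    return np.tensordot(weights, np.asarray(corner_irs, dtype=float), axes=1)
```

The weights always sum to one, so a direction that coincides with a measured corner returns that corner's response unchanged. The interaural time difference would be interpolated separately and added afterwards as a linear phase.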

[0153] In Fig. 22, Fig. 23 and Fig. 24, Group I angles
designate angles above the horizontal plane and at the
same side as the ear (including the horizontal plane and
the median plane), and Group II angles designate the remaining
angles.

Claims 

1. A method of generating binaural signals by filtering
at least one sound input with at least one set of two
filters, each set of two filters having been designed
so that the two filters simulate the left ear and the
right ear parts of a Head-related Transfer Function
(HTF),

characterized in that
the duration of the time domain representation of
the transfer function of the filters simulating the HTF
is at the most 2 ms.

2. A method according to claim 1, wherein the duration
of the time domain representation of the transfer
function of the filters simulating the HTF is at the
most 1.5 ms.

3. A method according to claim 2, wherein the duration
of the time domain representation of the transfer
function of the filters simulating the HTF is at the
most 1.2 ms.

4. A method according to claim 3, wherein the duration
of the time domain representation of the transfer
function of the filters simulating the HTF is at the
most 1 ms.

5. A method according to claim 4, wherein the duration
of the time domain representation of the transfer
function of the filters simulating the HTF is at the
most 0.9 ms.

6. A method according to claim 5, wherein the duration
of the time domain representation of the transfer
function of the filters simulating the HTF is at the
most 0.75 ms.

7. A method according to claim 6, wherein the duration
of the time domain representation of the transfer
function of the filters simulating the HTF is at the
most 0.5 ms.
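
To make the durations in claims 1-7 concrete, the following sketch converts a duration limit into a number of FIR taps and filters one input with a left/right filter pair. The sampling rate, the convention that N taps span N sample periods, and all function names are assumptions made for illustration only.

```python
import numpy as np

def max_taps(duration_ms, fs):
    """Number of FIR taps spanning at most duration_ms at sampling rate fs (Hz).

    Assumes N taps correspond to a duration of N / fs seconds.
    """
    return int(duration_ms * fs // 1000)

def truncate(h, duration_ms, fs):
    """Keep only the leading part of an impulse response that fits the duration limit."""
    return np.asarray(h, dtype=float)[: max_taps(duration_ms, fs)]

def binaural(x, h_left, h_right):
    """Filter one sound input with the left ear and right ear filters."""
    return np.convolve(x, h_left), np.convolve(x, h_right)
```

Under these assumptions a 2 ms limit corresponds to 96 taps at 48 kHz, and the 0.5 ms limit of claim 7 to 24 taps.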

8. A method according to any of the preceding claims,
wherein the HTF is used generally for a population
of humans for which the binaural signals are
intended, the HTF being determined in such a manner
that the standard deviation of the amplitude, in
dB, between subjects, over at least a major part of
the frequency interval between 1 kHz and 8 kHz is
at the most as shown in Fig. 22 for at least one of
the curves thereof.

9. A method according to claim 8, wherein the HTF
has been determined in such a manner that the
standard deviation of the amplitude, in dB, between
subjects, over at least a major part of the frequency
interval between 1 kHz and 8 kHz is at the most as
shown in Fig. 23 for at least one of the curves
thereof.

10. A method according to claim 9, wherein the HTF
has been determined in such a manner that the
standard deviation of the amplitude, in dB, between
subjects, over at least a major part of the frequency
interval between 1 kHz and 8 kHz is at the most as
shown in Fig. 24 for at least one of the curves
thereof.

11. A method according to any of the preceding claims,
wherein the value at zero Hertz of the frequency
domain description of the transfer function of the filters
simulating the HTF is in the range from 0.316 to
3.16.

12. A method according to claim 11, wherein the value
at zero Hertz of the frequency domain description
of the transfer function of the filters simulating the
HTF is in the range from 0.5 to 2.

13. A method according to claim 12, wherein the value
at zero Hertz of the frequency domain description
of the transfer function of the filters simulating the
HTF is in the range from 0.7 to 1.4.

14. A method according to claim 13, wherein the value
at zero Hertz of the frequency domain description
of the transfer function of the filters simulating the
HTF is in the range from 0.8 to 1.2.

15. A method according to claim 14, wherein the value
at zero Hertz of the frequency domain description
of the transfer function of the filters simulating the
HTF is in the range from 0.9 to 1.1.

16. A method according to claim 15, wherein the value
at zero Hertz of the frequency domain description
of the transfer function of the filters simulating the
HTF is in the range from 0.95 to 1.05.
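
For a discrete-time filter, the value at zero Hertz equals the sum of the impulse response samples, so the ranges of claims 11-16 can be checked directly. The sketch below also shows one simple way of moving the value to a target, by adding the same constant to every sample; that particular adjustment and the function names are assumptions for illustration, not necessarily the adjustment intended in the claims.

```python
import numpy as np

def dc_value(h):
    """Value at zero Hertz of an FIR filter: the sum of its impulse response samples."""
    return float(np.sum(h))

def dc_value_db(h):
    """The same value expressed in dB (requires a positive DC value)."""
    return 20.0 * np.log10(dc_value(h))

def set_dc(h, target=1.0):
    """Shift every sample by the same constant so that the DC value equals target."""
    h = np.asarray(h, dtype=float)
    return h + (target - h.sum()) / h.size
```

Expressed in dB, the widest range (0.316 to 3.16) corresponds to roughly -10 dB to +10 dB around unity.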

17. A method according to any of the preceding claims,
wherein the HTF has been determined using at
least one of the following measures a)-h):

a) the sound pressure p2 from a spatially
arranged sound source has been measured at
the entrance, or close to the entrance, to the
blocked ear canal of a person or of an artificial
head,

b) the sound pressure p1 from the sound
source has been measured at a position
between the ears of the test person or of the
artificial head, with the test person or the artificial
head absent,

c) the frequency domain description of the HTF
has been calculated by dividing the frequency
domain description of p2 by the frequency
domain description of p1, optionally followed by
low-pass filtering,

d) the time domain description of the HTF has
been obtained by inverse Fourier transformation
of the frequency domain description,

e) for a particular direction in relation to the test
person or the artificial head, the left and right
ear parts of the HTF have been measured
simultaneously,

f) the test person has been standing during the
measurement of the HTF,

g) the test person has been monitored by visual
means such as video to ensure that the position
of the head of the test person was not
changed during the measurement of the HTF
and/or any measurement of an HTF during
which the position of the head differed from the
correct position has been discarded,

h) the test person himself monitored the position
of his head e.g. by means of mirrors or a
video monitor in order to keep his head in the
correct position during measurement of the
HTF,

i) the measurements were carried out in an
anechoic chamber, the measurement time for
one HTF being at the most 5 seconds, preferably
at the most 3 seconds, more preferably at
the most 2 seconds, such as about 1.5 seconds.
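
Measures c) and d) amount to a spectral division followed by an inverse Fourier transformation. A minimal sketch is given below, assuming that p1 and p2 are available as sampled impulse responses; the function name, the FFT length handling and the small floor that guards against division by zero are assumptions, and the optional low-pass filtering is omitted.

```python
import numpy as np

def htf_from_measurements(p1, p2, n_fft=None, floor=1e-12):
    """Return (H, h): H = P2 / P1 on an n_fft-point grid, and its inverse transform h.

    p1: response measured at the head position with the head absent.
    p2: response measured at the blocked ear canal entrance.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    n = n_fft or max(p1.size, p2.size)
    P1 = np.fft.rfft(p1, n)
    P2 = np.fft.rfft(p2, n)
    P1 = np.where(np.abs(P1) < floor, floor, P1)  # avoid dividing by (near) zero
    H = P2 / P1
    return H, np.fft.irfft(H, n)
```

The FFT length should exceed the length of p2 so that the circular division does not wrap the tail of the response back onto its start.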

18. A method according to claim 17, wherein the reference
point is at most 0.8 cm from the entrance to
the blocked ear canal.

19. A method according to claim 18, wherein the reference
point is at most 0.6 cm from the entrance to
the blocked ear canal.

20. A method according to claim 19, wherein the refer- 
ence point is at most 0.3 cm from the entrance to 
the blocked ear canal. 

21. A method according to claim 20, wherein the refer- 
ence point is at the entrance to the blocked ear 
canal. 

22. A method according to any of the preceding claims, 
wherein the HTF has been obtained from HTFs (B) 
for at least two test objects, a test object being a 
person or an artificial head, 
by selecting 

a) an HTF which, when used in binaural synthesis,
gives a sound impression which, when
presented to a test panel, is found to give a
high degree of conformity with real life listening
to a sound source in the direction in question,
or

b) an HTF which, when described objectively,
e.g. in the frequency or the time domain, shows
a high degree of similarity to individual HTFs of
a population.

23. A method according to claim 22, wherein the HTFs
relating to at least two angles of sound incidence
have been individually selected among HTFs (B).

24. A method according to any of claims 1-21, wherein
the HTF has been obtained from HTFs (B) for at
least two test objects, a test object being a person
or an artificial head, the test objects optionally
being selected according to claim 22 or 23,

by averaging, in the frequency domain, the
amplitude of the HTFs (B), the amplitude averaging
being performed, e.g., on a pressure,
power or logarithmic basis, followed by minimum
phase or zero phase construction to
obtain an HTF,
or

by averaging in the time domain or in the frequency
domain

a) the time-aligned HTFs (B), the time
alignment being performed, e.g., by

1) alignment to the onset of the pulse
or to the first peak, or

2) alignment to maximum cross-correlation, or

b) the HTFs (B) from which the linear
phase part and/or the all-pass phase part
has been removed,

the averaging being optionally followed by addition
of linear phase components giving an interaural
time difference, the linear phase components or the
interaural time difference suitably being obtained in
a separate averaging of the linear phase components
or the interaural time differences of the original
HTFs (B).
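
The first alternative of claim 24 (amplitude averaging followed by minimum phase construction) can be sketched as below. The cepstrum-folding construction is one standard way of obtaining a minimum phase response from a magnitude and is assumed here; the claim does not prescribe it. The function names, the even FFT length and the small floor applied before taking logarithms are likewise assumptions.

```python
import numpy as np

def average_magnitude(mags, basis="pressure"):
    """Average magnitude spectra (one row per test object) on the chosen basis."""
    mags = np.asarray(mags, dtype=float)
    if basis == "pressure":
        return mags.mean(axis=0)
    if basis == "power":
        return np.sqrt((mags ** 2).mean(axis=0))
    if basis == "log":
        return np.exp(np.log(mags).mean(axis=0))
    raise ValueError("basis must be 'pressure', 'power' or 'log'")

def minimum_phase_ir(mag, floor=1e-8):
    """Minimum phase impulse response with the given full FFT magnitude.

    mag must be the symmetric magnitude on an n-point FFT grid, n even.
    """
    mag = np.maximum(np.asarray(mag, dtype=float), floor)
    n = mag.size
    cepstrum = np.fft.ifft(np.log(mag)).real
    fold = np.zeros(n)
    fold[0] = 1.0
    fold[1 : n // 2] = 2.0  # double the causal part, drop the anti-causal part
    fold[n // 2] = 1.0
    return np.fft.ifft(np.exp(np.fft.fft(cepstrum * fold))).real
```

The resulting response reproduces the averaged magnitude on the FFT grid. The linear phase giving the interaural time difference would then be added separately, as stated at the end of the claim.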

25. A method according to claim 24, wherein the frequency
axis, or a section or sections thereof, or the
time axis, or a section or sections thereof, has/have
been compressed or expanded individually for each
HTF to reduce the differences between the HTFs
before the averaging.

26. A method according to any of claims 1-23, wherein
the HTF has been obtained from HTFs (B) for at
least two test objects, a test object being a person
or an artificial head, by averaging characteristic
parameters of the HTFs (B), the characteristic
parameters for instance being

the frequency and the amplitude of characteristic
points, e.g. peaks or notches, or the frequency
of 3 dB points of peaks or notches,
when the HTFs (B) are described in the frequency
domain,
or

the time and the amplitude of characteristic
points, e.g. a characteristic positive peak or a
characteristic negative peak, or the time of a
characteristic zero crossing, when the HTFs
are described in the time domain,
or

the coordinates of, or the characteristic frequency
and the Q-factor of poles and zeroes,
when the HTFs are described in the complex s-
or z-domain.

27. A method according to any of the preceding claims,
wherein the HTF

a) has been selected from the group consisting
of the 97 HTFs shown in each of Fig. 1, Fig. 2
and Fig. 3, optionally truncated according to
any of claims 1-7, optionally followed by an
adjustment of the DC-component to conform
with any of claims 11-16, or

b) has been obtained by interpolation between
two or more of the 97 HTFs shown in each of
Fig. 1, Fig. 2 and Fig. 3, optionally truncated
according to any of claims 1-7, optionally followed
by an adjustment of the DC-component
to conform with any of claims 11-16, or which

c) when used for binaural synthesis gives an
audible impression which is not clearly different
from the impression given by an HTF (C)
according to a) or b),

the term clearly different meaning that a panel
of inexperienced listeners obtain a score of at
least 90 per cent correct answers, when the
HTF is compared to an HTF (C) in a balanced
four-alternative-forced-choice test, using programme
material for which the binaural signals
are used, or for which the binaural signals are
intended to be used.

28. A method according to claim 27 c), wherein the 
term clearly different means that the panel of inex- 
perienced listeners obtain a score of at least 80 per 
cent correct answers. 

29. A method according to claim 28, wherein the term
clearly different means that the panel of inexperienced
listeners obtain a score of at least 70 per
cent correct answers.

30. A method according to claim 29, wherein the term 
clearly different means that the panel of inexperi- 
enced listeners obtain a score of at least 50 per 
cent correct answers. 

31. A method according to any of the preceding claims,
wherein the HTF is adapted to an individual listener
or a group of listeners, comprising modifying the
interaural time difference of the HTF, the modification
being based on

a) the physical dimension of the listener or the
listeners, such as head diameter, distance
between the ears, etc., or

b) a psychoacoustic experiment, where the
HTF is used for binaural synthesis, and the
interaural time difference is adjusted so that the
sound impression as perceived by the individual
listener or the group of listeners is found to
give a high degree of conformity with real life
listening to a sound source in the direction
intended.

32. A method according to any of the preceding claims,
wherein the HTF has been obtained as an approximate
HTF for any specific angle of sound incidence,
by interpolating neighbouring HTFs, the interpolation
being carried out as a weighted average of
neighbouring HTFs.

33. A method according to claim 32, wherein the averaging
procedure is an averaging procedure as
claimed in any of claims 24-26.

34. A method according to any of the preceding claims,
wherein the HTF has been obtained as an approximate
HTF on the basis of a nearby HTF (B), by performing
an adjustment of the linear phase of the
HTF (B) to obtain substantially the interaural time
difference pertaining to the angle of incidence for
which the approximate HTF is intended.



35. A method of obtaining an approximate HTF for a
short distance between the listener and the sound
source for use in methods according to any of the
preceding claims, comprising

a) combining

the left ear part of an HTF representing the
geometric angle from the source position
to the left ear position or optionally, if the
left ear is not visible from the source position,
the geometric angle from the source
position tangentially to the part of the head
obscuring the ear, with

the right ear part of an HTF representing
the geometric angle from the source position
to the right ear position or optionally, if
the right ear is not visible from the source
position, the geometric angle from the
source position tangentially to the part of
the head obscuring the ear,

and/or

individually adjusting the level of the left ear
and the right ear parts of the HTF.

36. A method according to claim 35, wherein the individual
adjustment of the level of the left ear and the
right ear parts of the HTF is performed in accordance
with the distance law for spherical sound
waves, using the geometrical distance to each of
the two ears or optionally, where an ear is not visible
from the source position, the geometrical distance
to the tangent point of the part of the head
obscuring the ear, or to the ear passing the tangent
point and following the curvature of the head.
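
Under the distance law for spherical waves the sound pressure is inversely proportional to distance, so the level adjustment of claim 36 reduces to one gain per ear. The sketch below uses straight-line distances only and does not implement the variant for an ear hidden by the head; the coordinates, the reference distance at which the HTF is assumed to have been measured, and the function names are assumptions for illustration.

```python
import math

def ear_gains(source, left_ear, right_ear, ref_distance):
    """Linear gains for the left and right ear parts, relative to ref_distance.

    All positions are (x, y, z) tuples in metres; pressure scales as 1 / r.
    """
    r_left = math.dist(source, left_ear)
    r_right = math.dist(source, right_ear)
    return ref_distance / r_left, ref_distance / r_right

def gain_to_db(gain):
    """Express a linear pressure gain as a level change in dB."""
    return 20.0 * math.log10(gain)
```

Halving the distance to an ear doubles the pressure there, i.e. raises the level by about 6 dB, which is why nearby sources produce markedly different levels at the two ears.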

37. A method of generating binaural signals, when performed
as claimed in any of claims 1-34 using an
HTF produced according to claim 35 or 36.

38. A method of generating binaural signals by filtering
at least one sound input with one set of two filters,
the set of two filters having been obtained from an
HTF as characterized in any of the preceding
claims by further processing, such as filtering,
equalizing, delaying, modelling, or any other
processing that maintains the information contents
inherent in the original HTF, the said further
processing being substantially identical for the left
and right ear parts of the HTF.

39. A method of generating binaural signals by filtering 
at least one sound input with at least two sets of two 
filters, the sets of two filters having been obtained 
from HTFs as characterized in any of the preceding 
claims by further processing, such as filtering, 
equalizing, delaying, modelling, or any other 
processing that maintains the information contents 
inherent in the original set of HTFs, the said further 
processing being substantially identical for the various
angles, but not necessarily being substantially
identical for the left and right ear parts of the sets of
HTFs.

40. A method according to claim 38 or 39, wherein the 
signal processing has been performed so that 

a) the HTF of a specific angle, e.g. in the frontal 
plane, has a flat frequency response, or 

b) the amplitude of a binaural signal formed by 
binaural synthesis of a diffuse sound field is 
substantially identical to the amplitude of the 
diffuse sound field itself, or 

c) the amplitude of a binaural signal formed by
binaural synthesis of a specific sound field is
substantially identical to the amplitude of the
sound field at the p1 reference point.

41. A method according to any of the preceding claims,
wherein at least two sound inputs (1) are combined
into one sound input (2) which is filtered with one
set of two filters simulating an HTF.

42. A method according to claim 41, wherein the sound
inputs (1) which are combined are sound inputs
belonging together in spatial groups, such as "from
the front", "from behind", "from the right side", "from
the left side", etc., in relation to the listener.

43. A method according to any of the preceding claims, 
wherein the binaural signals are supplemented with 
supplementing signals corresponding to reflections 
and/or reverberations, optionally filtered by appro- 
priate HTFs. 

44. A method according to any of the preceding claims, 
wherein the at least one sound input is filtered with
at least two sets of two filters, each set of two filters 
having been designed so that the two filters simu- 
late the left ear and the right ear parts of a Head- 
related Transfer Function (HTF). 

45. A method according to claim 44, wherein the at
least one sound input is filtered with at least three 
sets of two filters, each set of two filters having been 
designed so that the two filters simulate the left ear 
and the right ear parts of a Head-related Transfer 
Function (HTF). 

46. A method according to any of the preceding claims, 
wherein the binaural signals are used for simulation 
of a sound field of a specific environment, such as a 
room, e.g. a concert hall, wherein transmission of
sound from a set of sound sources with specific 
positions in said environment to a receiving point 
with a specific position in said environment is simu- 
lated by 

a) forming, for each of a number of transmis- 
sion paths for each sound source, a binaural 
signal (A), and 

b) combining the binaural signals (A) for each 
sound source into a binaural signal (B), and

c) combining the binaural signals (B) of the set 
of sound sources into a resulting binaural sig- 
nal (C). 

47. A method for noise measurement and/or assess- 
ment of the effect of noise, or any other measure- 
ment and/or simulation where a description of a 
sound transmission is involved, comprising using 
binaural signals produced according to any of
claims 1-34 or claims 38-45 and/or HTFs as characterized
in any of claims 8-10 or claims 17-36.

48. A method according to any of the preceding claims,
further comprising sensing position and/or orientation,
and/or changes in position and/or orientation,
of the head of a listener and modifying the electronic
signal processing in dependence of the
sensed position and/or orientation and/or changes
in position and/or orientation.

49. A method for the sensing of the position and/or orientation,
and/or changes in position and/or orientation,
of the head of a listener, for use in connection
with the method of claim 48, comprising

a) transmitting at least one pulse of energy,
such as an ultrasonic wave pulse or an infrared
light pulse, adapted to be received by one or
more receiving means mounted at and following
the movements of the head of the listener,

b) detecting the arrival time or each of the
arrival times of the transmitted energy pulse or
pulses at the receiving means or each of the
receiving means and optionally detecting or
recording the time of transmission or each of
the times of transmission from the corresponding
transmitter or transmitters, and

c) calculating the position and/or orientation of
the head of the listener based on the detected
arrival time or times and optionally on the
detected or recorded time or times of transmissions.
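
Step c) can be illustrated for the ultrasonic case with two receivers mounted at the ears. The sketch below derives the distance from the travel time and a head rotation angle from the difference between the two arrival times, assuming a distant transmitter (plane wave), a known receiver spacing and a speed of sound of 343 m/s; these assumptions and the function names are illustrative and are not specified in the claim.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def distance_from_arrival(t_transmit, t_arrival, c=SPEED_OF_SOUND):
    """Distance travelled by a pulse, from its transmission and arrival times in seconds."""
    return c * (t_arrival - t_transmit)

def yaw_from_arrivals(t_left, t_right, receiver_spacing, c=SPEED_OF_SOUND):
    """Head rotation in radians from the arrival times at the two ear receivers.

    Positive when the pulse reaches the left receiver first. Uses the far-field
    relation sin(yaw) = c * (t_right - t_left) / receiver_spacing.
    """
    s = c * (t_right - t_left) / receiver_spacing
    return math.asin(max(-1.0, min(1.0, s)))  # clamp against measurement noise
```

With a receiver spacing of 0.18 m, a rotation of 30 degrees corresponds to an arrival time difference of about 0.26 ms.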

50. A method according to any of claims 48-49,
wherein the modification of the electronic signal
processing is adapted to impart to the listener the
perception that virtual sound sources remain in
position irrespective of the position and/or orientation,
and/or changes in position and/or orientation,
of the listener's head.

51. A method according to any of claims 48-50,
wherein the signal processing is modified using the
approximation method of claim 34.

52. A method according to any of the preceding claims,
further comprising transmitting the binaural signals
in the form of modulated ultrasonic waves, the
waves being received by a listener equipped with
two receiving means each of which is mounted
close to the appertaining ear of the listener,
changes in orientation of the listener's head relative
to a reference orientation being compensated on
the basis of the difference of the travel time of the
ultrasonic wave pulses between the two receiving
means so that the listener will perceive that virtual
sound sources remain in a reference position irrespective
of the orientation of the listener's head, the
compensation being automatic or carried out by
involving electronic signal processing.

53. A method of generating binaural signals according
to any of the preceding claims, wherein the sound
inputs to be filtered by Head-related Transfer Functions
are

signals (A1...An) of at least one single channel
communication system and/or at least one multichannel
communication system which signals
are adapted for being supplied to at least one
signal-to-sound transducer, or
signals which are adapted for being decoded
into such signals (A1...An),

so that the binaural signal, when reproduced, is
capable of imparting to a listener a perception of listening
to a spatial sound field with a set of n individually
positioned virtual sound sources, each of
which transmits one of the signals (A1...An).

54. A method according to claim 53, wherein the position
and orientation of the receiver's head is monitored,
and head position and head orientation data
obtained in the monitoring is used to enable the
receiver to selectively transmit a message to one of
the transmitters corresponding to one of the signals
(A1...An) by turning his head in the direction of the
virtual sound source corresponding to said transmitter.

55. A method according to claim 53 or 54, wherein the
sound inputs to be filtered by Head-related Transfer
Functions are generated in connection with monitoring
and/or controlling and/or communicating with
a multitude of units, such as in air traffic control, in
control of cabs or trucks, in messenger offices, in
life saving stations, in central offices of watchmen,
in telephone meetings, in meetings using audio-visual
communication means, etc.

56. A method of generating binaural signals according
to any of claims 1-48, wherein the sound inputs to
be filtered by Head-related Transfer Functions are

signals (A1...An) of a multichannel sound reproducing
system which signals are adapted for
being supplied to n different signal-to-sound
transducers of the multichannel sound reproducing
system, or

signals which are adapted for being decoded
into such signals (A1...An),

so that the binaural signal, when reproduced, is
capable of imparting to a listener a perception of listening
to a spatial sound field similar to the sound
field which would have resulted from listening to the
n signal-to-sound transducers spatially arranged in
a room.

57. A method according to claim 56, wherein the multichannel
sound reproducing system is a Dolby Surround
System or any N channel sound system
pertaining to HDTV.

58. A method according to claim 56 or 57, wherein the
multichannel sound reproducing system is a Stereo
system.

59. A method according to any of the previous claims 1-34
or 37-45, wherein the binaural signals are used
for positioning a set of sounds at specific virtual
positions in relation to an operator.

60. A method according to claim 58, wherein a moving
virtual sound source with a characteristic sound
moves continuously or discontinuously between
specific positions of a set of virtual sound sources,
the operator being enabled to communicate a specific
message to the system according to a particular
virtual sound source by prompting the system
when the moving virtual sound source is positioned
substantially at the position of said virtual sound
source.

61. A method according to claim 60, wherein the position
of the moving virtual sound source is controlled
by the operator.

62. A method according to claim 60 or 61, wherein the
position of the moving virtual sound source is controlled
by the orientation and/or position of the head
of the operator.

63. A method according to any of claims 59-62,
wherein the positions are dynamically controlled by
a computer.

64. A method according to claim 63, when used for
controlling or assisting the movement and/or position
of an object and/or a living being by dynamically
positioning a virtual sound source in relation to
the object and/or living being, so as to guide the
object and/or the living being in relation to the position
of the virtual sound source.

65. A method according to any of the preceding claims,
further comprising compensation of transfer characteristics
of a signal-to-sound transducer.

66. A method according to claim 65, wherein sound
pressure at the entrance, or close to the entrance,
to the blocked ear canal is considered as the output
of the signal-to-sound transducer.

67. A method according to any of the preceding claims,
wherein the binaural signal is emitted by means of
headphones.

68. A method according to claim 67, wherein the binaural
signal is transmitted to the headphones by wireless
means.

69. A method according to claims 66-68, further comprising
compensation for the difference in pressure
division at the input to the ear canal when the ear is
occluded, respectively unoccluded, by a headphone.

70. A method according to claim 69, wherein a description
of the difference in pressure division at the
input to the ear canal when the ear is occluded,
respectively unoccluded, by a headphone, is
obtained by measuring the transmission from the
headphone to the sound pressure

at the entrance, or close to the entrance, of the
blocked ear canal, and

at the entrance, or close to the entrance, of the
open ear canal,

the ratio of the frequency domain descriptions of
these transmissions being obtained as characteristic
of the pressure division (X) in this situation,
and
measuring the transmission from a sound source
that does not influence the acoustic radiation
impedance of the ear, to the sound pressure

at the entrance, or close to the entrance, of the
blocked ear canal, and

at the entrance, or close to the entrance, of the
open ear canal,

the ratio of the frequency domain descriptions of
these transmissions being obtained as characteristic
of the pressure division (Y) in this situation,
and obtaining the ratio X/Y which constitutes the
frequency domain description of the difference in
pressure division.

71. A method according to any of claims 1-66, wherein
the binaural signal is emitted by means of loudspeakers,
optionally having crosstalk counteracted
by supplementing the binaural signal with artificial
electrical crosstalk compensation signals.

72. A method according to any of claims 65-71,
wherein the compensation, or the crosstalk counteraction,
is adapted to the individual listener.



73. A method according to any of the preceding claims,
wherein the binaural signal is stored on an audio
storage medium or broadcast.

74. A method according to claims 41-46 in combination
with claim 73, wherein each sound input (2) to be filtered
by Head-related Transfer Functions representing
a combination of more than one sound
inputs (1) is stored or broadcast separately, such as
in a separate track or in a separate channel,
respectively, the binaural filtering being carried out
before or after storing or broadcasting.

75. A method of computer modelling or analysing the
cerebral human binaural sound localization ability,
comprising using binaural signals obtained according
to any of previous claims or HTFs according to
any of claims 8-10 or claims 17-33 or claims 35-36.

76. A method for designing headphones, comprising
adapting the transfer characteristics thereof to
resemble an HTF as characterized in any of claims
8-10 or claims 17-36 for a given direction, e.g., the
frontal direction, or to resemble weighted averages
of such HTFs corresponding to averages of given
directions.

77. An artificial head having HTFs which correspond
substantially to HTFs according to any of claims 8-10
or claims 17-33 or claims 35-36 for all angles of
sound incidence, or at least for angles of sound
incidence which constitute part of the total sphere
surrounding the artificial head, such as the upper
hemisphere or the frontal region.

78. A method for producing an artificial head according
to claim 77, comprising adapting the geometric
characteristics of the artificial head and/or the
acoustic properties of the materials used so as to
approximate the HTFs of the artificial head to HTFs
according to any of claims 8-10 or claims 17-33 or
claims 35-36 for all angles of sound incidence, or at
least for angles of sound incidence which constitute
part of the total sphere surrounding the artificial
head, such as the upper hemisphere or the frontal
region.

79. A method according to any of the preceding claims,
wherein the two filters simulating the left ear and
right ear parts of the HTF are discrete time filters.

80. A method according to any of the preceding claims,
wherein the two filters simulating the left ear and
right ear parts of the HTF are digital filters.